Harold E. Burtt, Ohio State University 


Laurence S. McGaughran, University of 
Houston 


Quinn McNemar, Stanford University 


Journal of Applied Psychology 


John G. Darley, Editor 
University of Minnesota 


Consulting Editors 


Alexander Mintz, City College of New York 


Harold F. Rothe, Fairbanks, Morse and 
Company 


Julian B. Rotter, Ohio State University 
Thomas A. Ryan, Cornell University 
Donald E. Super, Columbia University 
Miles A. Tinker, University of Minnesota 


Aurrep C. Weron, University of New Mexico 


Arthur C. Hoffman, Managing Editor 


Helen Orr, Promotion Manager 


Volume 43, 1959 


Published bimonthly by the American Psychological Association, Inc. 
Prince and Lemon Sts., Lancaster, Pa. and 1333 16th St. N.W., 
Washington 6, D. C. 


Second-class postage paid at Lancaster, Pa. 


© 1959 by the American Psychological Association, Inc. 


LANCASTER PRESS, INC., LANCASTER, PA, 


Contents of Volume 43 


Altman, I. See McGinnies, E. 
Andrews, R. S. See Lebo, D. 
Andrews, T. G. See Whittenberg, J. A. 
Asher, J. J., and Evans, R. I. An Investigation of Some Aspects of the Social Psychological 
Impact of an Educational Television Program.......... 166 
Baker, C. H., and Boyes, G. E. Increasing Probability of Target Detection with a Mirror- 
[illegible].......... 195 
Bass, B. M. See Distefano, M. K., Jr. 
Behringer, R. See Ziller, R. C. 
Berdie, R. F. A Femininity Adjective Check List.......... 327 
Bolda, R. A. See Lawshe, C. H. 
Boling, Jewell, and Fine, S. A. Cues Used by Raters in the Rating of Temperament Re- 
quirements of Jobs.......... 102 
Bourassa, G. L., and Guion, R. M. A Factorial Study of Dexterity Tests.......... 199 
Boyes, G. E. See Baker, C. H. 
Bradley, J. V. Direction-Of-Knob-Turn Stereotypes.......... 21 
Browne, C. G. See Kazimier, L. J. 
Bruce, M. M. See Murray, L. E. 
Brune, R. L. See Lawshe, C. H. 
Buckner, D. N. The Predictability of Ratings as a Function of Interrater Agreement.......... 60 
Burdick, H. A., Green, E. J., and Lovelace, J. W. Predicting Trademark Effectiveness.......... 285 
Byrne, D. The Effect of a Subliminal Food Stimulus on Verbal Responses.......... 249 
Calvin, A. D., and Dollenmayer, Karen S. Subliminal Perception: Some Negative Find- 
ings.......... 187 
Carp, Frances M. See DeRath, G. 
Chalmers, W. E. See Stagner, R. 
Champion, J. M., and Turner, W. W. An Experimental Investigation of Subliminal Per- 
ception.......... 382 
Cheek, Gloria. See Jacobson, E. 
Churchill, A. V. Optimal Interval for Visual Interpolation: The Effect of Viewing Distance.......... 125 
Clarke, W. V. See Merenda, P. F. 
Cliff, N., Newman, S. H., and Howell, Margaret A. Selection of Subprofessional Hospital 
[illegible].......... 42 
Comrey, A. See Levonian, E. 
Corey, L. G. Psychological Adjustment and the Worker Role: An Analysis of Occupational 
[illegible].......... 253 
Crites, J. O. A Coding System for Total Profile Analysis of the Strong Vocational Interest 
Blank.......... 176 
[illegible].......... 12 
Derber, M. See Stagner, R. 
Dicken, C. F. Simulated Patterns on the Edwards Personal Preference Schedule.......... 372 
Distefano, M. K., Jr., and Bass, B. M. Prediction of an Ultimate Criterion of Success as a 
[illegible].......... 40 
Dollenmayer, Karen S. See Calvin, A. D. 
Eilbert, L. R., and Glaser, R. Differences between Well and Poorly Adjusted Groups in an 
[illegible].......... 271 
Engstrom, W. C., and Powers, Mary E. A Revision of the Study of Values for Use in 
[illegible] Readership Research.......... 74 
Evans, R. I. See Asher, J. J. 
Fine, S. A. See Boling, Jewell. 
Fredian, A. J. See Meyer, H. D. 




Garvey, W. D., and Henson, Jean B. Interactions between Display Gain and Task-Induced 
Stress in Manual Tracking 
Garvey, W. D., and Taylor, F. V. Interactions among Operator Variables, System Dy- 
namics, and Task-Induced Stress 
Glaser, R. See Eilbert, L. R. 
Goodstein, L. D. See Heilbrun, A. B., Jr. 
Green, E. J. See Burdick, H. A. 
Groth, Hilde, and Lyman, J. Effects of Massed Practice and Thickness of Handcovering 
on Manipulation with Gloves 
Guion, R. M. See Bourassa, G. L. 
Guttman, I. See Mayo, G. D. 
Hackman, R. C. See Ross, S. 
Hakmiller, K. L. See Kaufman, R. A. 
Hannum, T. E. See Mills, D. H. 
Heilbrun, A. B., Jr., and Goodstein, L. D. Relationships between Personal and Social 
Desirability Sets and Performance on the Edwards Personal Preference Schedule 
Henson, Jean B. See Garvey, W. D. 
Hirt, M. Use of the General Aptitude Test Battery to Determine Aptitude Changes with 
Age and to Predict Job Performance 
Horton, D. L. See Mecherikoff, M. 
Howell, Margaret A. See Cliff, N. 
Izard, C. E. Personality Correlates of Sociometric Status 
Jacobson, E., Trumbo, D., Cheek, Gloria, and Nangle, J. Employee Attitudes toward 
Technological Change in a Medium Sized Insurance Company 
Jerison, H. J. Effects of Noise on Human Performance 
Jones, L. V. Prediction of Consumer Purchase and the Utility of Money 
Kamenetzky, J. Contrast and Convergence Effects in Ratings of Foods 
Kaufman, R. A., Hakmiller, K. L., and Porter, L. W. The Effects of Top and Middle 
Management Sets on the Ghiselli Self-Description Inventory 
Kaufman, R. A. See Porter, L. W. 
Kay, B. R. The Use of Critical Incidents in a Forced-Choice Scale 
Kazimier, L. J., and Browne, C. G. Comparability of Wonderlic Test Forms in Industrial 
Testing 
Levy, W. See Levonian, E. 
Lovelace, J. W. See Burdick, H. A. 
Lowe, W. F. See Martindale, R. L. 
Lucier, O. See Lebo, D. 
Lykken, D. T. The GSR in the Detection of Guilt 
Lyman, J. See Groth, Hilde. 
McGinnies, E., and Altman, I. Discussion as a Function of Attitudes and Content of a 
Persuasive Communication 
McGinnies, E. See Page, R. H. 
Madril, E. The Use of IBM Mark-Sense Cards as Multiple-Choice Paper-and-Pencil 
Test Answer Forms 




Maher, H. Follow-up on the Validity of a Forced-Choice Study Activity Questionnaire in 
Another Setting 
Maher, H. Studies of Transparency in Forced-Choice Scales: I. Evidence of Trans- 
Mayo, G. D., and Guttman, I. Faking in a Vocational Classification Situation 
Meadow, A., and Parnes, S. J. Evaluation of Training in Creative Problem Solving 
Meadow, A., Parnes, S. J., and Reese, H. Influence of Brainstorming Instructions and 
Problem Sequence on a Creative Problem Solving Test 
Mecherikoff, M., and Horton, D. L. Preferences for Letters of the Alphabet 
Merenda, P. F., and Clarke, W. V. AVA Validity for Textile Workers 
Merenda, P. F., and Clarke, W. V. The Predictive Efficiency of Temperament Character- 
istics and Personal History Variables in Determining Success of Life Insurance Agents 
Meyer, H. D., and Fredian, A. J. Personality Test Scores in the Management Hierarchy 
Mills, D. H., and Hannum, T. E. The Transparency of the Taylor Scale of Manifest 
Anxiety in a College Population 
Minor, F. J. See Pepinsky, H. B. 
Murray, L. E., and Bruce, M. M. A Study of the Validity of the Sales Comprehension 
Test and Sales Motivation Inventory in Differentiating High and Low Production in 
Life Insurance Selling 
Myers, J. H., and Errett, W. The Problem of Preselection in Weighted Application Blank 
[illegible] 
Nangle, J. See Jacobson, E. 
Newman, S. H. See Cliff, N. 
Northrup, Doris. See Krug, R. E. 
Nye, C. T. See Rothe, H. F. 
Owens, W. A. A Comment on the Recent Study of the Mechanical Comprehension Test 
(CC) by [illegible] Decker 
Page, R. H., and McGinnies, E. Comparison of Two Styles of Leadership in Small Group 
Discussion 
Parnes, S. J. See Meadow, A. 
Pepinsky, H. B., Pepinsky, Pauline N., Minor, F. J., and Robin, S. S. Team Productivity and Con- 
tradiction of Management Policy Commitments 
Pepinsky, Pauline N. See Pepinsky, H. B. 
Porter, C. R. See Margolis, C. 
Porter, L. W. Self-Perceptions of First-Level Supervisors Compared with Upper-Manage- 
ment Personnel and with Operative Line [illegible] 
Porter, L. W., and Kaufman, R. A. Relationships between a Top-Middle Management 
Self-Description Scale and Behavior in a Group Situation 
Powers, Mary E. See Engstrom, W. C. 
Procter, D. See Levonian, E. 
Rambo, W. W. The Effects of Partial Pairing on Scale Values Derived from the Method of 
Paired Comparisons 
Reese, H. See Meadow, A. 
Riland, L. H. Relationship of the Guttman Components of Attitude Intensity and Personal 
[illegible] 
Robin, S. S. See Pepinsky, H. B. 
Rodgers, D. A. Personality of the Route Salesman in a Basic Food Industry 
Ross, S., Dardano, J., and Hackman, R. C. Conductance Levels during Vigilance Task 
[illegible] 
Ross, S. See Whittenberg, J. A. 
Rothe, H. F., and Nye, C. T. Output Rates among Machine Operators: II. Consistency 
Related to Methods of Pay 
Sharp, H. C. Effect of Subliminal Cues on Test Results 




Silverman, R. E. The Comparative Effectiveness of Animated and Static Transparencies.......... 16 
Simon, Betty Pearl. See Simon, J. R. 
Simon, J. R., and Simon, Betty Pearl. Duration of Movements in a Dial Setting Task as a 
Function of the Precision of Manipulation.......... 389 
Smith, E. E., and Kight, S. S. Effects of Feedback on Insight and Problem Solving 
Efficiency in Training Groups 
Speroff, B. J. Job Satisfaction Study of Two Small Unorganized Plants.......... 315 
Spielberger, C. D. Evidence of a Practice Effect on the Miller Analogies Test.......... 259 
Sprecher, T. B. A Study of Engineers’ Criteria for Creativity.......... 141 
[illegible].......... 306 
Swanson, E. O., and Layton, W. L. Relationship of National Merit Scholarship Screening 
Test Scores to Test Data Obtained Earlier in High School.......... 2 
Sydiaha, D. On the Equivalence of Clinical and Statistical Methods.......... 395 
Taylor, F. V. See Garvey, W. D. 
Torrance, E. P. An Experimental Evaluation of “No-Pressure” Influence.......... 109 
Trankell, A. The Psychologist as an Instrument of Prediction.......... 170 
Triandis, H. C. Categories of Thought of Managers, Clerks, and Workers about Jobs 
and People in an Industry.......... 338 
Triandis, H. C. Cognitive Similarity and Interpersonal Communication in Industry.......... 321 
Triandis, H. C. Differential Perception of Certain Jobs and People by Managers, Clerks, 
and Workers in Industry.......... 221 
Trites, D. K., Kubala, A. L., and Cobb, B. B. Development and Validation of Adaptability 
Criteria 
Turner, W. W. See Champion, J. M. 
Uhr, L. Sex as a Determinant of Driving Skills: Women Drivers! 
Voas, R. B. Vocational Interests of Naval Aviation Cadets: Final Results.......... 70 
Wakeley, J. H. Quantification of the Term “Objectionable” as Applied to Colorants in 
[illegible].......... 137 
Want, R. The Frames of Reference of Flying Instructors.......... 86 
Whittenberg, J. A., Ross, S., and Andrews, T. G. Effects of Altering Task Components on 
[illegible].......... 226 
Winick, C. Art Work Versus Photography: An Experimental Study.......... 180 
Ziller, R. C., and Behringer, R. Group Persuasion by the Most Knowledgeable Member 
under Conditions of Incubation and Varying Group Size.......... 402 


Journal of Applied Psychology 

Vol. 43, No. 1 


THE DIMENSIONALITY OF UNION-MANAGEMENT 
RELATIONS AT THE LOCAL LEVEL 


ROSS STAGNER 
Wayne State University 

MILTON DERBER AND W. ELLISON CHALMERS 
University of Illinois 


In developing systematic descriptions of 
any particular class of natural phenomena, 
two approaches have been widely 
employed. We may identify these briefly as 
the typological and the dimensional 
approaches. One may, to bring order into a 
mass of observations, cluster cases into types; 
or he may arrange them according to 
quantified dimensions and study the relationships 
among these dimensions.1 

Research on union-management relations 
has made some use of types (cf. Harbison 
& Coleman, 1951; Selekman, 1949). The 
types described were based upon observation, 
but in neither case were precise criteria 
laid down by which establishments 
could be assigned to a particular category. 
In the Illini City studies (University of 
Illinois, 1953, 1954) some experimentation with 
types based on a more precise set of 
operations was reported. Following up the Illini 
City studies, the authors have recently (1957) 
reported on a set of empirical clusters 
determined in a purely objective manner by 
reference to quantitative scores for the 
establishments concerned. The data indicated that 
meaningful types could be isolated in this 
fashion, and that these types were associated 
with environmental variables (Derber, 
Chalmers, & Stagner, 1958) in logically consistent 
ways. 


1 We would like to express our appreciation to 
[illegible] of Southern Illinois University, and 
to our former assistants, Herbert Schaffer, 
Robert Ver[illegible], Robert Mitchell, Sheldon Luskin, and 
John [illegible], who aided in various phases of the 
collection and analysis of data. 



It is, however, possible to analyze the same 
data in a strictly dimensional manner. The 
technique of factor analysis permits us to 
break down a pattern of relationships among 
data into independent underlying dimensions. 
This may lead to the combining of several 
variables into a single dimension, or it may 
indicate that a particular raw score is related 
to more than one independent dimension. 
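
The idea that one raw score may be related to more than one independent dimension can be illustrated with a minimal sketch of the orthogonal factor model. The variable names and the loadings below are hypothetical and chosen only for illustration; they are not the loadings obtained in this study. Under the model, the variance a variable shares with the common factors (its communality) is the sum of its squared loadings, and the correlation implied between two variables is the sum, over factors, of the products of their loadings.

```python
# Hypothetical loadings of three variables on two orthogonal factors.
# These numbers are invented for illustration; they are not the paper's results.
LOADINGS = {
    "union_attitude": [0.5, 0.5],                 # related to both factors
    "union_satisfaction_wages": [0.2, 0.7],       # mostly the second factor
    "union_satisfaction_grievances": [0.7, 0.1],  # mostly the first factor
}


def communality(name):
    """Share of a variable's variance carried by the common factors."""
    return sum(a * a for a in LOADINGS[name])


def reproduced_correlation(x, y):
    """Correlation implied by the model: sum over factors of a_xk * a_yk."""
    return sum(a * b for a, b in zip(LOADINGS[x], LOADINGS[y]))


for name in LOADINGS:
    print(f"{name}: communality = {communality(name):.2f}")

print(
    "implied r(union_attitude, union_satisfaction_wages) =",
    round(reproduced_correlation("union_attitude", "union_satisfaction_wages"), 2),
)
```

In this sketch the first variable draws equally on both factors, which is the situation described above in which a single score cannot be assigned to one dimension alone.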

The purpose of the present investigation is 
to apply the factor-analytic method to the 
same data used in the typological investiga- 
tion, in the hope of reaching some conclusion 
as to the relative fruitfulness of the two 
approaches. 

Population and variables studied. The 
analysis is based upon 41 establishments in 
three downstate Illinois communities, ranging 
in size from 73 to 2100 hourly employees, in- 
cluding utilities, service and manufacturing 
enterprises (the latter involving both producer 
and consumer goods). Data were collected 
by intensive interviews with the two top union 
officials and the two top management persons 
in labor relations.2 

The present article is based on statistical 
treatment of 35 variables. Twenty of these 
were scored by consensus of the responses 
given by the four respondents.3 In the case 


2 A more detailed report of these interview pro- 
cedures and a more extensive definition of the proc- 
ess variables (1-20) have been published elsewhere 
(Derber et al., 1957). 

3 Consensus was operationally defined as agreement 
by at least three out of four respondents on a certain 
point. In the case of “factual” items, one spokes- 
man from each side was reinterviewed if consensus 




tudes, bureaucracy, or mutual understanding, will be 
too simple. Many dimensions must be taken into 
consideration in a satisfactory picture of the rela- 
tionship. 

Factor 1: Management satisfaction. Some support 
for those who have stressed the attitudinal compo- 
nent as the defining characteristic of local relation- 
ships is found in Factor 1. This factor coincides 
rather closely with the scale scores for management 
approval of the union, and for management satis- 
faction with scope and depth of union influence. 
The three other managerial satisfaction scores are 
also represented here. It is not surprising that re- 
sort to pressure tactics in both contract negotiations 
and grievance settlement is relatively rare (negative 
loadings) in establishments which are high on Fac- 
tor 1; this lack of pressure may play a causal role, 
of course, in determining the favorable management 
attitude. (Our data do not permit a causal interpre- 
tation, since they are purely cross-sectional.) It is 
likewise appropriate that establishments high on this 
factor should report pleasant emotional tone in both 
contract and grievance discussions. 

Factor 2: Local settlement of disputes. The next 

most important factor seems to be one of local 
settlement of disputes. The highest loading is on the 
use of arbitrators on grievances (which we have re- 
ferred to as an element in autogeny) and autonomy 
in grievance settlement is very close. Willingness to 
make concessions in contract and grievance settle- 
ments and understanding of basic intentions in nego- 
tiations are also highly loaded on this factor. These 
may well contribute to the process of reaching a 
settlement locally. 
Establishments high on this factor are also faster 
in settling grievances, and management is more satis- 
fied with the grievance process. These establishments 
report less pressure in negotiations, less yielding to 
pressure, and less legalism in settlements. 

Factor 3: Union satisfaction with relations. Man- 
agerial satisfaction proved to be a single dimension 
[illegible] as to its implications. 
The union satisfaction scores, however, prove 
[illegible] into two components, which 
appear to be, respectively, satisfaction with the gen- 
eral [illegible] the interpersonal relations 
[illegible], and satisfaction with union 
[illegible]. Union officers may be quite satis- 
fied on one of these categories, and little or not at 
all content with the other. 

High loadings here are on union satisfaction with 
contract manner, union satisfaction with the griev- 
ance procedure, and to lesser extent, union satisfaction 
with wages and benefits, and union attitude. Sur- 
prisingly enough, management satisfaction on griev- 
ances also has a sizeable loading here. It suggests 
that the union perception of management, as re- 
flected in this kind of factor, favors a relatively 
peaceful grievance procedure, even if conflicts de- 
velop at negotiation time. 

Factor 4: Union achievement. This factor may 
represent the achievements of the local union, or it 
may be confounded with the effects of being part of 
a given industry. It is almost identical with rank 
on hourly earnings (loading of +.83); however, the 
very high loadings with union influence scope and 
depth (both +.67) and the substantial emphasis on 
managerial satisfaction with wage level and fringe 
benefits indicate that both general union strength 
and an industry wage pattern may be involved. 
The norms of both parties as to acceptable levels of 
wages and of union influence will, of course, be 
affected by the pattern in the industry, as well as 
by community and other external variables. 

Factor 5: Bargaining style. While this factor at 
first glance seems lacking in unity, there seems to be 
a plausible interpretation. Establishments high on 
this factor would appear to be those in which the 
bargaining process is quite active, with management 
making many proposals to the union (+.77), with 
considerable pressure and threats (+.48 and +.40), 
with a fair amount of yielding in response to pres- 
sure (+.50). A tendency to appeal to past practices 
rather than to rely on legalisms seems to be asso- 
ciated with this fluid bargaining situation. 

Factor 6: Skill of work force. This factor is most 
closely related to the skill ratio for the production 
workers. It would suggest an industry rather than 
a local determinant. Establishments high on the fac- 
tor have considerable autonomy on contracts (+.54). 
Evidently high-skill establishments have considerable 
friction when negotiating contracts; the loadings on 
emotional tone, on conceding in negotiations, and on 
managerial satisfaction are negative and fairly large. 
Union satisfaction has no apparent relation to this 
factor. 

Factor 7: Union satisfaction with achievement. As 
noted above, union satisfaction splits into two inde- 
pendent dimensions. The union officers may be 
satisfied with the economic benefits and contract 
privileges won for their members, even if dissatis- 
fied with the way the relationship functions from 
day to day (since these are orthogonal factors, they 
vary independently). Factor 7 is highly loaded with 
the satisfaction felt by union officers regarding scope 
and depth of influence, as well as satisfaction with 
wages and fringes. The loading on this latter vari- 
able is considerably larger than for Factor 3. Union 
attitude to management, however, is about equally 
related to Factor 3 and Factor 7; this may indicate 
that generalized approval derives from both varieties 
of satisfaction. 

Factor 8: Size. The two principal loadings here 
are on objective size and on speed of grievance settle- 
ment (negative), which may well be dependent on 
size of establishment. The correlated variables (au- 
tonomy, pressure regarding grievances, failure to rely 
on past practice) are also of such a nature as to be 
readily affected by size. However, this may not be 
the best label for the factor. 

Factor 9: Legalism. This factor is the most diffi- 
cult to name. It has only one heavy loading, repre- 
senting the successful conclusion of contract agree- 
ments without mediation. The next largest loadings 
are on legalism in grievance handling (positive) and 



avoidance of past practice as a guide; and we have 
speculated that an attitude favoring the orderly 
handling of disagreements, abiding by contracts, etc., 
might underlie this factor. 
Factor 10: Effective grievance handling. This fac- 
tor is characterized by two high loadings on vari- 
ables related to grievances. Reported understanding 
of the other party's intentions and favorable emo- 
tional tone in grievance handling seem to define this 
dimension as one of effective grievance procedure. 
It is positively related to union attitude, but not to 
management’s attitude, which is plausible since un- 
ion officials are necessarily much more ego-involved 
in grievances than are managers. Somewhat puzzling 
is the tendency for high use of pressure during con- 
tract negotiations to go with the favorable grievance 
situation. 


Discussion 


Before considering the question of types 
versus dimensions, let us note certain general 
aspects of the findings. First of all, it seems 
clear that factor analysis, a method for identi- 
fying one or more independent linear dimen- 
sions which best describe several measured 
variables, produces meaningful results when 
applied to union-management relations. Stu- 
dents familiar with this area of study will find 
that most of the 10 factors pick up aspects 
which have been mentioned in the literature 
as important in one connection or another. 
The factored data, however, present certain 
advantages as compared with the direct ob- 
servation: (a) We find that certain variables 
which appear discrete are not independent 
and can appropriately be combined. (b) Cer- 
tain variables which appear to belong to- 
gether are shown to be independent. (c) The 
extent to which an establishment’s “score” on 
a dimension is related to each of several vari- 
ables can be determined quantitatively. 
Another interesting point relates to the 
variables, size, hourly earnings and skill ratio, 
thrown into the analysis. These three vari- 
ables were not determined by responses to 
questions about the relationship; they are, 
rather, independent of the process variables 
which were our chief interest. Each of these 
independent variables turned up with a very 
large loading in a single factor, an outcome 
by no means inevitable in terms of the 
[illegible]. Furthermore, not one of these vari- 
ables showed a substantial loading on any 
other but its own factor; their effects, in 
other words, were relatively unambiguous.6 
It seems certain that the psychological vari- 
ables such as attitude and satisfaction will al- 
ways have to be interpreted within a context 
influenced by these objective elements. 

It would be tempting to interpret the po- 
sition of Factor 1 (Management Satisfac- 
tion), accounting for more variance than any 
other factor, in terms of the decisive effect of 
management attitudes upon the relationship. 
This would in part be spurious, however, since 
we have loaded our investigation with vari- 
ables of an interpersonal character—under- 
standing, conceding, yielding, emotional tone, 
satisfaction. Within this group, managerial 
attitudes and satisfactions may well be the 
most significant. Had a different combina- 
tion of variables been used, some other fac- 
tor might have the largest loadings. But it 
seems safe to conclude that management atti- 
tude and satisfaction constitute a major di- 
mension of the relationship. 

Finding two dimensions instead of one for 
union satisfaction seems to us important. 
Psychologists are prone to stress perception 
of the relationship as a unit, and to predict 
that satisfaction will be likewise unitary. It 
is clear, at least in these data, that satisfac- 
tion on the part of union officers must be 
treated as two dimensions, one relating to 
union contractual accomplishments and the 
other to the daily interactions with manage- 
ment. This observation clears up some con- 
fusions and can forestall others. 

Similarly, it is worthwhile noting that Fac- 
tors 4, 6, and 8 seem each to be related to 
“industry” characteristics, but in our sta- 
tistics they behave independently. It would 
have been easy to think of “industry” as a 
single variable, but it appears in these data 
that at least two are involved: an industry 
characteristic relating to earnings and union 
influence, and another involving skill ratio. 

Finally, it should be emphasized that we 
do not allege that we have isolated the 10 


6 Students of factor-analytic methods may also be 
interested in the observation that these three vari- 
ables had 10 loadings above .30 in the unrotated 
matrix of 10 factors; but after Quartimax rotation, 
there were only 3 such loadings, each of them much 
larger. This supports the view that rotation leads 
to more meaningful results than the original factor 
pattern. 
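
The effect described in footnote 6 can be illustrated with a minimal two-factor sketch. The loading matrix below is hypothetical (it is not the matrix from this study), and the one-degree grid search stands in for the analytic Quartimax solution. The Quartimax criterion used is the sum of the fourth powers of all loadings, which is largest when each variable concentrates its variance on a single factor.

```python
import math


def rotate(loadings, degrees):
    """Orthogonally rotate a two-factor loading matrix by the given angle."""
    c = math.cos(math.radians(degrees))
    s = math.sin(math.radians(degrees))
    return [(a * c + b * s, -a * s + b * c) for a, b in loadings]


def quartimax_criterion(loadings):
    """Sum of fourth powers of all loadings (to be maximized)."""
    return sum(a ** 4 + b ** 4 for a, b in loadings)


def count_above(loadings, cutoff=0.30):
    """Number of loadings whose absolute value exceeds the cutoff."""
    return sum(abs(x) > cutoff for row in loadings for x in row)


# Hypothetical simple structure: each variable loads on one factor only.
SIMPLE = [(0.8, 0.0), (0.7, 0.0), (0.0, 0.7), (0.0, 0.9)]
# An "unrotated" solution, obtained here by turning the axes 30 degrees.
UNROTATED = rotate(SIMPLE, 30)

# Coarse grid search over rotation angles (assumption: 1-degree steps suffice).
best_angle = max(range(90), key=lambda d: quartimax_criterion(rotate(UNROTATED, d)))
ROTATED = rotate(UNROTATED, best_angle)

print("loadings above .30 before rotation:", count_above(UNROTATED))
print("loadings above .30 after rotation: ", count_above(ROTATED))
```

In this toy case the rotation leaves every communality unchanged while reducing the number of loadings above .30, each remaining loading being larger, which parallels the observation in the footnote.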




fundamental dimensions of the local labor- 
management relationship. It is quite possible 
that further studies will show that Factors 5, 
8, and 9 are inadequately identified, and that 
Factors 6 and 10 include more than we have 
indicated. The addition of more variables 
may clarify the significance, in terms of in- 
dependent determinants or of dependent con- 
sequences, of all the dimensions. However, 
this analysis certainly gives us a good pic- 
ture of several dimensions which are not 
likely to be discarded by further research, 
particularly 1, 2, 3, 4, and 7. 

Types vs. dimensions. Let us now com- 
ment briefly on the general problem of de- 
veloping “types” of union-management rela- 
tions, as compared with efforts to identify 
underlying dimensions. It was demonstrated 
in our previous article (1957) that empirical 
types can be identified and that they show 
meaningful differences in variables not used 
to define the type. The types employed were 
based on three variables: union influence 
(scope plus depth), pressure (contract pres- 
sure, grievance pressure, and yielding to pres- 
sure) and attitudes (management and union). 
Thus, 7 of the 35 variables studied factorially 
were used to define the types. The four main 
types or clusters located were characterized as 
follows: union-dominated, management-hos- 
tile; management-dominated, union-acquies- 
cent; management-dominated, union-hostile; 
and high union influence, mutually friendly. 

An examination of the data on the other 
variables led to the conclusion that “the value 
of a factor may be significantly different when 
associated with one cluster than with an- 
other.” For example, union influence is high 
in both Cluster A and Cluster E, but the 
former includes pressure and hostility, the 
latter neither. Both show high re- 
liance on past practice in grievance settle- 
ment, but, it seems clear, for different rea- 
sons. Or, to take Clusters B-C and E: emo- 
tional tone is high (favorable) in both, but 
Cluster B-C is composed of weak unions ap- 
parently grateful to a kindly management, 
while Cluster E includes strong unions which 
appear friendly but not submissive. Thus the 
typological analysis seems valuable in that it 
calls attention to the varying significance of 
a specific variable when it is set in the con- 
text of one or another type of relationship. 
Against this it must be observed that the 
dimensional analysis reported here also brings 
out hidden significances in the raw data. We 
have noted the separation of union satisfac- 
tion into two components. The attitude of 
union officials toward management is posi- 
tively correlated with both of these factors; 
but obviously this says that favorable union 
attitude may be related to high achievements 
(as in Cluster A) or to good treatment (as in 
Cluster B-C). Furthermore, we find that fa- 
vorable union attitude is also related to Fac- 
tors 9 and 10; Factor 10 seems related to 
Cluster E, but Factor 9 has no counterpart 
in the type data. Hence, it is quite possible 
that we may get farther by fractionating a 
complex component like union attitude by 
the technique of dimensional analysis than 
we can simply by studying its place in em- 
pirically chosen clusters. 


Emotional tone in contract negotiations is 
positively loaded on Factors 1 and 3 (Satis- 
faction of Management and of Union) but 
negatively on Factor 6, an industry factor. 
This finding indicates that we should not 
treat emotional tone as a single, unambiguous 
fact; it has different meanings in the con- 
text of establishments high on 6 as against 
those high on 1 or 3. 

In other words, multivariate analysis 
(whether by factor analysis, as in the present 
study, or by typology, as in our earlier ar- 
ticle) seems necessary for a full exploitation 
of the data. Univariate analysis, as in com- 
parison of establishments for frequency of 
strikes or for attitudes of conflict, is inade- 
quate. The meaning of a single score varies 
substantially according to the context, the 
kind of union-management relationship within 
which it occurs. 

For these reasons, we cannot conclude that 
dimensional analysis based on factors is defi- 
nitely superior to typing, or vice versa. Both 
of these multivariate techniques, however, 
seem superior to univariate methods. 


Summary 


1. Forty-one establishments were ranked
on 35 variables (aspects of the union-management
relationship), rank-difference correlations
computed, and the matrix factor-analyzed.

2. Ten dimensions accounted for most of 
the common variance. Brief interpretations 
were offered for these 10. In some cases, an 
assumed single variable (such as union satis- 
faction) was found to be a complex function 
of several underlying dimensions. 

3. Three variables, size, hourly earnings, 
and skill ratio, determined three independent 
factors, despite the fact that they were out- 
numbered by the variables descriptive of the 
relationship. 

4. A comparison of typological analysis 
with a dimensional approach based on fac- 
tor analysis indicates that each increases the 
amount of information above that derived 
from a univariate analysis. Single variables 
receive differing interpretations according to 
the context (type or factor structure) within 
which they occur. Multivariate analysis is 
judged superior to univariate analysis for this 


kind of study, but it is not feasible to state 
that typological or factorial technique is defi- 
nitely superior. It seems possible that both 
techniques could profitably be utilized in a 
single research design. 


Received May 6, 1958. 


References 


Derber, M., Chalmers, W. E., & Stagner, R. Uni- 
formities and differences in local union-manage- 
ment relationships. Indus. Labor Rel. Rev., 1957, 
11 (Oct.), 56-71. 

Derber, M., Chalmers, W. E., & Stagner, R. Envi- 
ronmental variables and union-management ac- 
commodation. Indus. Labor Rel. Rev., 1958, 11 
(April), 413-428. 

Harbison, F. H., & Coleman, J. R. Goals and
strategy in collective bargaining. New York:
Harper, 1951.

Selekman, B. M. Varieties of labor relations. Harvard
Bus. Rev., 1949, 27, 175-199.

University of Illinois, Institute of Labor and Indus- 
trial Relations. Labor-management relations in 
Illini City. Vol. 1. Case studies. Vol. 2. Ex- 
plorations in comparative analysis. Champaign, 
Ill.: Author, 1953, 1954. 


Journal of Applied Psychology 
Vol. 43, No. 1, 1959 


THE TRANSPARENCY OF THE TAYLOR SCALE OF 
MANIFEST ANXIETY IN A COLLEGE 
POPULATION 


DAVID H. MILLS 
U. S. Army 


AND THOMAS E. HANNUM


Iowa State College 


Since the development of the Scale of Mani- 
fest Anxiety by Taylor (1951) this instru- 
ment has been employed in two forms (Tay- 
lor, 1953). One form consists of only the 50 
items, all of which contribute to the anxiety 
scale. The other form consists of a 225-item 
scale which, in addition to the 50 anxiety 
items, includes a number of buffer items. 

Many of the recent studies concerned with 
anxiety have used the shortened scale as the 
measure of anxiety because of the correlation 
found to exist between the two forms. This 
correlation ranges from .68 as reported by 
Taylor (1953) to .95 reported by McCreary 
and Bendig (1954). 

Many of the recent studies employing this 
short form as a measure of anxiety have also 
used college students as subjects. It was felt 
that the scale, particularly when used with 
college students, might be so transparent that 
its validity would depend on the motivations 
of the subject. (If this were true, the results 
of recent studies employing this scale may 
need to be re-evaluated.) There are two as- 
pects of the question. One is: can students 
alter their scores in a desired manner when 
so directed? The other is, will students alter 
their scores in other situations when not spe- 
cifically directed to do so? 

The purpose of this study was to investi- 
gate only the former question of the trans- 
parency of the short form of the Taylor Scale 
of Manifest Anxiety in a college population. 


Procedure 


A total of 190 undergraduate students in an intro- 
ductory psychology course at Iowa State College 
were used as Ss. These Ss were randomly assigned 
to five equal groups of 45. As a result of absentees 
on the days the tests were administered, the original 
groups (n=45) were reduced in size. Since the 
smallest group had 38 Ss, individuals were randomly 


eliminated from the other groups until all groups had 
38 Ss. This equalization of groups was done for 
convenience in statistical analyses. Three weeks after
the beginning of the course, the A-scale was administered
to the entire group. The only differential treatment
among the groups was in the printed instructions
attached to the front of their test booklets.
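
The equalization step described above can be sketched in Python. This is a minimal illustration and not the authors' actual procedure: the function name `equalize_groups`, the seed, and the attendance counts in the usage note are hypothetical. Only the target of 38 Ss per group comes from the text.

```python
import random


def equalize_groups(groups, seed=None):
    """Randomly drop members so every group matches the smallest group."""
    rng = random.Random(seed)
    target = min(len(members) for members in groups.values())
    # rng.sample draws without replacement, so each retained S is unique.
    return {name: rng.sample(members, target) for name, members in groups.items()}
```

For example, five groups that shrank from 45 to 44, 41, 38, 43, and 40 Ss (hypothetical attendance figures) would each be cut to 38, giving 190 Ss in all.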

The students were orally cautioned to read these
instructions carefully before turning to the scale
proper. During the first administration, three of the
groups had the usual instructions given when the
scale is administered under normal conditions (called
standard administration hereafter). A fourth group
had instructions to respond as if they were very
poorly adjusted persons. The fifth group had instructions
to respond as if they were very well-adjusted
persons. A month later the scale was readministered
to the same groups under the same conditions
as the first administration except that the instructions
were altered for all but Group 1, or the
control group, as indicated in Table 1. Both administrations
were completed prior to any class discussion
of anxiety.


Statistical Method 


In order to analyze the effect the differential
instructions had upon the various group performances,
several analyses of variance were
computed as indicated in Table 3.

Pearson product-moment correlations were
computed between the first and second administrations
of the five groups.
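
For readers who want to see what such an analysis computes, the following is a minimal sketch of a one-way analysis of variance F ratio in plain Python. It assumes independent groups. The function name `one_way_f` and the toy data in the usage note are illustrative and are not the study's data.

```python
def one_way_f(groups):
    """Return (F, df_between, df_within) for a one-way analysis of variance."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    # Between-groups sum of squares: group size times squared deviation of the group mean.
    ss_between = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups)
    # Within-groups sum of squares: deviations of scores from their own group mean.
    ss_within = sum(sum((x - sum(g) / len(g)) ** 2 for x in g) for g in groups)
    df_between, df_within = k - 1, n_total - k
    f_ratio = (ss_between / df_between) / (ss_within / df_within)
    return f_ratio, df_between, df_within
```

With the toy groups `[1, 2, 3]`, `[2, 3, 4]`, and `[5, 6, 7]`, the between-groups mean square is 13 and the within-groups mean square is 1, so F = 13 on 2 and 6 degrees of freedom.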


Table 1 


Group Arrangement and Differential Instructions 


         Instructions on the      Instructions on the
Group    First Administration     Second Administration
1        Standard                 Standard
2        Standard                 Poorly Adjusted
3        Standard                 Well Adjusted
4        Poorly Adjusted          Standard
5        Well Adjusted            Standard




Table 2

Means, Variances, and Standard Deviations of Groups for Both Administrations

                                             First       Second
                                             Adminis-    Adminis-    Mean         Significance
Groups                           Statistic   tration     tration     Difference   Level
1. n = 38                        Mean        16.55       14.61         1.94       <.05
   (Standard-Standard)           Variance    36.09       45.81
                                 SD           6.01        6.77
2. n = 38                        Mean        13.13       39.79       -26.66       <.01
   (Standard-Poorly Adjusted)    Variance    37.58      154.39
                                 SD           6.13       12.43
3. n = 38                        Mean        14.55        6.68         7.87       <.01
   (Standard-Well Adjusted)      Variance    64.90       17.19
                                 SD           8.05        4.14
4. n = 38                        Mean        41.11       10.55        30.55       <.01
   (Poorly Adjusted-Standard)    Variance    58.75       38.47
                                 SD           7.67        6.20
5. n = 38                        Mean        12.76       16.05        -4.29       <.01
   (Well Adjusted-Standard)      Variance    66.46       70.92
                                 SD           8.15        8.42
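
The entries of Table 2 can be cross-checked arithmetically, since each standard deviation should be the square root of the variance above it and each mean difference should equal the first mean minus the second. The sketch below does this in Python. The values are transcribed from the table, and the Group 4 first-administration mean is read as 41.11, an assumption based on the reported difference of 30.55. The tolerances are arbitrary allowances for rounding.

```python
import math

# (group, first mean, second mean, printed difference,
#  first variance, second variance, first SD, second SD), from Table 2.
TABLE_2 = [
    (1, 16.55, 14.61, 1.94, 36.09, 45.81, 6.01, 6.77),
    (2, 13.13, 39.79, -26.66, 37.58, 154.39, 6.13, 12.43),
    (3, 14.55, 6.68, 7.87, 64.90, 17.19, 8.05, 4.14),
    (4, 41.11, 10.55, 30.55, 58.75, 38.47, 7.67, 6.20),  # 41.11 is an assumed reading
    (5, 12.76, 16.05, -4.29, 66.46, 70.92, 8.15, 8.42),
]


def sd_mismatches(rows, tol=0.01):
    """Groups whose printed SD differs from sqrt(variance) by more than tol."""
    bad = []
    for g, _, _, _, v1, v2, s1, s2 in rows:
        if abs(math.sqrt(v1) - s1) > tol or abs(math.sqrt(v2) - s2) > tol:
            bad.append(g)
    return bad


def difference_mismatches(rows, tol=0.02):
    """Groups whose printed difference differs from (first - second) by more than tol."""
    return [g for g, m1, m2, d, *_ in rows if abs((m1 - m2) - d) > tol]
```

Under these readings every standard deviation agrees with its variance, while the Group 5 means give 12.76 - 16.05 = -3.29 rather than the tabled -4.29, so one of those three entries may be misprinted in the source.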
Table 3

Summary of All Analyses of Variance Computed

                        Difference Between
First Administration                   Second Administration                  F       P
Group 1 (Standard)               and   Group 1 (Standard)                   6.34    <.05
Group 4 (Poorly Adjusted)        and   Group 2 (Poorly Adjusted)            0.31    >.05
Group 5 (Well Adjusted)          and   Group 3 (Well Adjusted)             16.79    <.01
All 5 Groups                           ...                                105.49    <.01
Groups 1, 2 and 3
  (All Standard)                       ...                                  2.13    >.05
Groups 1, 2 and 3
  (All Standard)
  and
  Groups 4 and 5 (Poorly and
  Well Adjusted)                       ...                                128.41    <.01
Group 4 (Poorly Adjusted)
  and
  Group 5 (Well Adjusted)              ...                                289.31    <.01
...                                    Groups 1, 4 and 5
                                         (All Standard)                     6.03    <.01
...                                    Group 1 (Standard) and
                                         Groups 4 and 5 (Standard)          .084    >.05
...                                    Group 4 (Standard) and
                                         Group 5 (Standard)                11.23    <.01




Table 4 


Correlations Between the Two Administrations 
of the Various Groups 


Group (Instructions) Correlation 
1. (Standard-Standard) 0.72 
2. (Standard—Poorly Adjusted) —0.04 
3. (Standard—Well Adjusted) —0.04 
4. (Poorly Adjusted-Standard) 0.03 
5. (Well Adjusted-Standard) 0.63 


Results 


1. Group means, variances, standard devia- 
tions. A summary table of the means, vari- 
ances, and standard deviations for the first 
and second administrations of the five groups 
is shown in Table 2. As indicated in Table 2, 
the mean of the control group (Group 1) de- 
creased 1.94 from the first to the second ad- 
ministration. The means of the remaining 
four groups all changed in the expected di- 
rection. 

2. Analyses of variance between groups. A 
summary of all the analyses of variances 
which were computed is shown in Table 3. 

3. Coefficients of correlation between both 
administrations of all groups. Pearson prod- 
uct-moment correlation coefficients computed 
between the first and second administration of 
the five groups are shown in Table 4. 

Of the correlations shown in Table 4, only 
two were significantly different from zero. 
One of these was the correlation between the 
two administrations to Group 1, which re- 


ceived standard instructions on both adminis- 
trations. 
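
Whether a Pearson correlation differs reliably from zero can be checked by converting r to a t statistic with n - 2 degrees of freedom. The sketch below applies this to the Table 4 values with n = 38 per group. The two-tailed .05 critical value of about 2.03 for 36 degrees of freedom is taken from standard t tables and is an assumption here, since the authors do not state which test or level they used.

```python
import math

TABLE_4 = {1: 0.72, 2: -0.04, 3: -0.04, 4: 0.03, 5: 0.63}  # group -> r, from Table 4


def r_to_t(r, n):
    """t statistic for testing a Pearson r against zero, with n - 2 degrees of freedom."""
    return r * math.sqrt(n - 2) / math.sqrt(1.0 - r * r)


def significant_groups(correlations, n=38, t_crit=2.03):
    """Groups whose |t| exceeds the assumed two-tailed critical value."""
    return [g for g, r in correlations.items() if abs(r_to_t(r, n)) > t_crit]
```

Applied to Table 4, only Groups 1 and 5 exceed the criterion, which agrees with the statement that two correlations were significantly different from zero.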


Discussion 


The differential administrative instructions 
employed in this study had a very definite 
effect upon test scores. The mean score for 
these groups of college students on the short 
form of the A Scale under standard adminis- 
trative instructions ranged from 10-16. When 
the Ss were instructed to appear poorly ad- 
justed, the mean scores rose to about 40. 
Both groups which received instructions to 
appear well adjusted had means which were 
less than the means of any of the groups 


which received the standard instructions on 
the first administration. 

The groups which received instructions to 
appear poorly adjusted were able to alter their 
scores much more than the groups which re- 
ceived instructions to appear well adjusted. 
There are several possible explanations for 
this greater shift. First, since the means of 
the standard groups were much closer to zero 
(minimum anxiety) than they were to 50 
(maximum anxiety), there was simply less 
room for the groups receiving instructions to 
appear well adjusted to alter their scores 
downward. A second possible explanation 
might be that the Ss on the standard ad- 
ministrations were already making some at- 
tempt to appear well adjusted. There is also 
the possibility that the Ss on the stand- 
ard administrations were, in truth, unusually 
well adjusted. The writers assumed that this 
was not the case. This lack of a significant 
difference between the groups which received 
the standard instructions and the groups 
which received instructions to appear well ad- 
justed on the first administration seems to in- 
dicate that even those Ss who received stand- 
ard instructions were attempting to appear 
favorably. The finding that the group which 
received instructions to appear well adjusted 
on the first administration (Group 5) did not 
appear as well adjusted as the group which 
received the same instructions on the second 
administration (Group 3) may be attributable 
to prior experience with the A Scale. Group 3 
had, at the time of the second administration, 
already taken the A Scale once with standard 
instructions. As a result of this prior experience,
they may have remembered some of the
responses they had made on the standard administration
and altered them in the well-adjusted
direction. Also, as was indicated in
the control group (Group 1), which received
standard instructions on both administrations,
there appeared to be a significant decrease in
scores from the first administration to the
second. This decrease may be a result of decreased
self-criticism.

There was no significant difference between
the two groups which received instructions to
appear poorly adjusted. It would seem, therefore,
that the concept of poor adjustment is
more clearly conceived than that of good adjustment.

The score an S made on the A Scale on the 
second administration was not a function of 
his score on the first administration except for 
Groups 1 (standard-standard) and 5 (well 
adjusted-standard). For Group 1, a high cor- 
relation between administrations would be ex- 
pected since adequate test-retest reliability for 
the A Scale has been demonstrated. The correlation
in this case was .73. Why Group 5
(well adjusted-standard) showed such a high
correlation (.63) between administrations is
not so apparent. This correlation may simply
be spurious due to sampling error. It has,
though, been suggested earlier that an S taking
the A Scale with standard instructions
might make an attempt to appear better adjusted
than he, in truth, actually is. Since
Group 5 on the first administration had received
instructions to appear well adjusted,
there may be a systematic perseveration from
this first administration to the second when
they received standard instructions. The least
that can be said is that there was something
systematically occurring within this
group which may be determined by further
research.


Summary and Conclusions 


This study attempted to evaluate the transparency
of the short form of the Taylor Scale
of Manifest Anxiety among a group of college
students. Five groups of Iowa State College
students were given two administrations
of the scale with instructions to appear either


well-adjusted, poorly adjusted, or to take the 
scale honestly. Analyses of the various group 
statistics and selected item statistics led to the 
following conclusions. 

1. A preconceived set, in this case the in- 
structions to appear well or poorly adjusted, 
had a definite effect on the total score on the 
scale. 

2. There is some evidence to support the 
belief that, even under standard instructions, 
Ss made an attempt to appear well adjusted. 

3. The second standard administration of
the test to the same group resulted in a significant
decrease in mean score which may be
indicative of a decrease in self-criticism due
to prior experience with the test.

This study indicated that the short form of 
the Taylor Scale of Manifest Anxiety should 
be used with caution. Because of the scale’s 
transparency, it does not appear to be an instrument
sophisticated enough to be administered
to college or university Ss without a lie
or suppressor scale, particularly when used in 
any situation in which the S might be moti- 
vated to alter his score in a desirable direction. 


Received February 10, 1958. 


References 


McCreary, J. B., & Bendig, A. W. Comparison of
the two scales of manifest anxiety. J. consult.
Psychol., 1954, 18, 206.

Taylor, Janet A. The relationship of anxiety to
conditioned eyelid response. J. exp. Psychol.,
1951, 41, 81-92.

Taylor, Janet A. A personality scale of manifest
anxiety. J. abnorm. soc. Psychol., 1953, 48, 285-290.




THE PICTURE-CHOICE TEST AS AN INDIRECT 
MEASURE OF ATTITUDES 


GILBERT DeRATH AND FRANCES M. CARP


Trinity University 


Indirect measurement of attitudes and in- 
terests has long been the goal of psychologi- 
cal measurement. Direct questionnaire ap- 
proaches have the disadvantage of fakeabil- 
ity and are subject to undesirable influences 
from response set and the varying social ac- 
ceptability of items. Indirect approaches by 
means of projective techniques have been 
handicapped by subjectivity, scoring difficul- 
ties, and problems of interpretation. As an 
approach combining the advantages of these 
two, the suggestion has been made that the 
well-known halo effect in ratings be used to 
measure attitudinal and interest characteris- 
tics of the rater (Campbell, 1950). The in- 
direction of such an approach could be of use 
in studying “public” vs. “private” and per- 
haps “conscious” vs. “unconscious” reactions, 
as well as in predicting subsequent behavior 
from test responses. At the same time, re- 
sponses would be objective and unambiguous 
in regard to scoring. 

Flyer (1951) devised a test to capitalize 
on the bias of the rater in evaluating pictures 
of people. Sixty-four pictures (of men for 
men, and of women for women) are judged 
twice by the S, the first time to select from 
each group of eight pictures the two best and 
the two least liked, and the second time to 
place each picture in one of a predetermined 
set of categories. The rationale is that, be- 
cause of the relatively unstructured nature of 
the stimulus material, the S’s responses will 
be largely determined by the sort of atti- 
tude and identification factors which are im- 
portant in projective techniques, while the 
method of responding has the scoring advan- 
tage of questionnaire-type instruments. 

In preliminary studies Flyer obtained 
promising results. College men and women 
selected the best and least liked from each 
set of pictures, then assigned each picture 
to one of eight behavior categories. An in- 
dependent rating of those categories for 
acceptability-nonacceptability was obtained 




from the same students, using a check list. 
There was a consistent tendency for pictures 
of individuals judged to have unacceptable 
characteristics to be selected as “least liked” 
and for those judged to have acceptable char- 
acteristics to be selected as “best liked,” dem- 
onstrating a nonchance relationship between 
the two types of judgments made about the 
pictures, though the Ss were not aware of any 
connection. A repetition of this study with 
an Air Force officer candidate group gave 
similar results. In addition, for this group 
self-ratings on the behavior variables were 
obtained, and a significant tendency was 
noted for Ss to “like” pictures of individuals 
they rated high on variables on which they 
rated themselves high, and to “dislike” individuals
whose high trait ratings were on variables
different from their own. The simple
affective response to a picture is, then, related
to the self concept of the viewer (Flyer,
1952).

Chambers (1957), using a similar technique,
reported positive correlations between
responses to college annual photographs and
measures of lack of inferiority, ascendancy,
and self-assertiveness for 18 women and 1° 
men undergraduate students. 


Hypotheses 


An area in which indirect measurement of
attitude seems advantageous is that of in-group
feeling or group identification. Informal
observation suggests that Flyer’s Picture-Choice
technique would be useful here;
convention behavior, for example, suggests
that people tend at first glance to react quite
differently to those perceived as members of
their own fraternal or occupational group than
to those not so perceived. In this study the
prediction is made that college students will
tend to “like” pictures of individuals perceived
as participants in the occupation to
which they aspire and with which they have,




Table 1

Mean Picture-Choice Scores of the Six College Major* Groups on Each Occupational Category

                                        College Major Group
Occupational
Category              N     Bus.    Engr.   F. Arts   Educ.   Soc. Sci.   Phy. Sci.
Business             33     1.84    -.18      .50      .09      -.24        -.36
Engineering          32      .78    2.50     1.50      .52       .08         .25
Fine Arts/Literary    8    -1.27   -1.50     2.25    -1.11       .22        -.63
Education            53      .00     .25      .50     1.56       .66         .70
Social Sciences      57    -1.00    -.53    -1.00     -.69      1.42         .27
Physical Sciences    44     1.15     .93     -.12      .60       .10        1.61


* The sample includes no agriculture majors and the university from which the sample
was taken does not have a department of agriculture.


then, at least partially identified. Further,
the extent of this rating bias should be indicative
of the strength of identification with,
or expressed interest in, the occupational field.


Procedure


Subjects were 227 male college students, freshmen
through first-year graduate students. The pictures
of the Form for Men of Flyer’s Picture-Choice Test
were used, with the occupational categories:


Military Officer      Education
Business              Social Sciences
Engineering           Physical Sciences
Fine Arts/Literary    Agriculture


In small groups of five to 20 the Ss were given
Picture-Choice booklets and four answer sheets. The
first sheet contained a brief introduction to the test
and spaces for name, college major, and college class.
The second contained the instructions:


A. Turn to page 1 of the Picture booklet. Look
over the faces of the people shown there. Pick out
the pictures of the two people you like best for any
reason at all. Place the numbers of the liked pictures
in the squares at the right ........


B. Look over the pictures again and pick out the
two you like least (dislike most). Place the numbers
of these disliked pictures in the squares at the
right


On the third page the S classified the pictures into 
the eight occupational categories. On Sheet 4, he 
first indicated the degree of his interest in each of 
the occupations by placing a check mark in the ap- 
propriate column: highly interested, somewhat inter- 
ested, somewhat disinterested, and highly disinter- 
ested. He then ranked the occupations in terms of 
his interest, from first to eighth; and last he circled 
the occupational title most closely related to his
college major.


Data Treatment and Results


Tests were scored by summing algebraically 
the numbers of liked and disliked pictures in 
each category. Mean scores obtained for the 
college majors most closely related to the oc- 
cupations listed are given in Table 1. Each 
college major group was compared with the 
five other groups combined, and chi square 
was computed as a test of independence. 
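
The scoring rule can be illustrated with a short Python sketch. It assumes that each liked picture adds 1 and each disliked picture subtracts 1 from the category to which the S assigned that picture. The text says only that the numbers were summed algebraically, so this weighting, the function name `picture_choice_scores`, and the picture numbers in the usage note are assumptions.

```python
def picture_choice_scores(liked, disliked, category_of):
    """Net score per category: +1 per liked picture, -1 per disliked picture."""
    # Start every category the S used at zero so unchosen categories still appear.
    scores = {category: 0 for category in category_of.values()}
    for picture in liked:
        scores[category_of[picture]] += 1
    for picture in disliked:
        scores[category_of[picture]] -= 1
    return scores
```

For instance, an S who liked pictures 1 and 2 (both classified as Business) and disliked picture 3 (classified as Engineering) would score 2 on Business and -1 on Engineering.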


Table 2

Mean Picture-Choice Scores for Each College Major Group on Its Related Occupational Category vs.
Mean Picture-Choice Scores of Five Other Majors Combined

College Major               Mean Picture-           Mean of Other
Group                  N    Choice Score      N     Groups Combined    Chi square    p
Business              33    1.84             194    -.35               10.94         .001
Engineering           32    2.50             [illegible]
Fine Arts/Literary     8    [illegible]
Education             [illegible]
Social Sciences       57    1.42             170    -.48               16.17         .001
Physical Sciences     44    1.61             [illegible]




THE COMPARATIVE EFFECTIVENESS OF ANIMATED
AND STATIC TRANSPARENCIES 1, 2


ROBERT E. SILVERMAN 


New York University 


The present study was concerned with 
evaluating the relative training effectiveness 
of animated transparencies as compared to 
static transparencies. Previous research by 
Swanson and Aukes (1956) and by Torkel- 
son (1954) dealing with the relative effective- 
ness of moving and static devices has not 
provided conclusive answers. However, the 
Torkelson study (1954) did suggest that 
mock-up and cutaways which permit the dem- 
onstration of movement contribute more to 
the understanding of motion concepts than do 
charts or manual illustrations. Obviously,
training agencies need more than suggestive 
data in order to choose between types of 
training devices, especially when the choice 
must take into account the issue of cost. 

Although the two earlier studies were not 
conclusive, they do point up possibly im- 
portant variables. One of the variables is 
the motion properties of the device to be 
studied. If, as the Torkelson study suggests, 
animated devices improve the understanding 
of motion concepts, then the number and
types of moving parts in a device should be 
a relevant variable. We would expect ani- 
mated devices to be more effective than static 
devices when the training situation involves 
devices with many as compared to few mov- 
ing parts. 

A second variable is the method of testing for
the effectiveness of a given training device.
In the studies cited above, paper and pencil 
tests were employed. These tests have the 
advantage of being easily administered and 
scored, but they also have the disadvantage 
of relying heavily on verbal factors. While 
it is true that knowledge of a phenomenon 


1 This research was carried out under Contract
N61339-78, Letter Order No. 2, between New York 
University and the U. S. Naval Training Device 
Center, Port Washington, N. Y. A more detailed 
report is found in NAVTRADEVCEN Technical 
Report 78-1. 

2 The author wishes to acknowledge the valuable
assistance of Sam Glucksberg and Roy Lachman. 




usually includes translating the phenomenon 
into verbal symbols, it does not follow that 
the type of knowledge imparted in many 
training situations can be measured by means 
of exclusively verbal techniques. It would 
seem necessary to consider the purpose of a 
specific training device. If the purpose is 
to impart knowledge of nomenclature or the 
ability to translate into verbal symbols the 
functioning of the device, then verbal paper 
and pencil tests would be appropriate. If 
the purpose is to teach mechanical arrange- 
ments and sequences such that the trainee 
can be given the device to disassemble or be 
given the parts to assemble, then it is prob- 
able that a nonverbal performance test would 
be appropriate. In many instances, devices 
which involve motion fall into the category 
of teaching mechanical systems, and here we 
would expect performance tests to be more 
sensitive indices of training effectiveness than 
the paper and pencil tests. 

The purpose of this study was to compare 
animated and static transparencies using 
three training devices and three methods of 
testing. The three devices differed in the 
number of moving parts and the testing 
methods differed in their emphasis on performance
as compared to verbal techniques.
It was predicted that the training effectiveness
of animated devices would be a positive
function of the number of moving parts in
the devices. Furthermore, this effectiveness
would be best demonstrated with performance
tests.


Procedure 


[...] drawn from [...] in selecting [...]: (a) no previous
military experience; (b) [...] previous experience with
weapons. The trainees were volunteers and were paid
$1.50 for their [...].

Training devices. The training devices consisted
of the following three transparencies: (a) the .45
cal. pistol, device number 29GA8. This animated
transparency contains 11 moving parts; (b) the .30
cal. carbine M2, device number 29HB7. This animated
transparency contains 8 moving parts; (c)
the .30 cal. rifle M1, device number 29HB6. Only
the trigger housing group section of this transparency
was used. This section contains 5 moving parts.

These devices provide a continuum of motion com- 
plexity based on the number of moving parts in each 
device. At the same time, the devices are similar 
contentwise in that each describes the sequence of 
events involved in cocking and firing a hand weapon. 
These sequences are not identical in all three weap- 
ons, but they are similar enough to make the num- 
ber of moving parts the most apparent difference 
among the devices. 

The animated transparencies also served as static 
transparencies. This was accomplished by present- 
ing various static views of the transparency without 
showing the actual transition movements. Thus, if 
the action of the hammer was being depicted, the 
hammer was shown in the cocked position as a 
static display, the lighting from the projector interrupted,
and a second view of the hammer in the
fired position shown. This was in contrast to the 
actual movement of the hammer as it was fired in 
the animated display. This technique insured that 
the only difference between the animated and static 
displays was the factor of movement; variables such 
as the size and color of the transparencies were auto- 
matically controlled. 

An overhead projector was used to present the 
various transparencies. 

Lectures. Tape-recorded lectures accompanied the 
presentations of the transparencies. The lectures for
each device were similar in form and in length. Each
lecture began with a general statement about the
weapon and then went on to describe the nomenclature.
The nomenclature was followed by a description
of the functioning sequences and then by a description
of the various safety mechanisms. The
functioning sequences and safety mechanisms were 
described three times. Finally the nomenclature was 
reviewed. The lectures lasted from 14 to 16 minutes. 
Much of the lecture material was obtained from the 
field manuals and technical manuals associated with 
the weapons in question.3

Training situation. The training was carried out 
with groups of 8 to 12 trainees. The trainees were 
first informed that this was to be a standard train- 
ing situation and that they were to pay close atten- 
tion, since they would be tested after training. 

The trainees were assigned to the devices and transparency
types at random. There were 50 trainees
assigned to each device. Twenty-five were trained
with the animated transparency and 25 trained with
the static transparency. The study was carried out
in phases, with training and testing on one device
being completed before a second device was begun.
The order of phases was: pistol, rifle, and carbine.


3 FM 23-35, TM 9-1295 for the .45 cal. pistol; 
FM 23-7, TM 9-1276 for the .30 cal. carbine; FM 
23-5, TM 9-1275 for the .30 cal. rifle M1. 


Two experimenters participated in the training. 
One experimenter operated the device and a second 
monitored the recorder. Operation of the devices 
was correlated with the lecture. With the animated 
devices the experimenter demonstrated the various 
movements described by the lecture and pointed to 
the parts indicated in the lecture. With the static 
devices, the attempt was to simulate the use of static 
slides or overlays. This was accomplished by in- 
serting a plate between the lens and bulb of the 
projector each time a new view was to be shown. 
Cues for this operation came from audible clicks 
which were recorded on the tape with the lecture. 
The static presentations were designed to impart the 
same information as did the animated presentations. 
Therefore any movement was shown in before and 
after phases. In showing the cocking of a weapon, 
a view of the uncocked weapon (the before view) 
was followed by a view of the cocked weapon (the 
after view). This was done using approximately the 
same time relationships as were used in the animated 
presentations. The cues on the tape enabled the ex- 
perimenter to gauge the time relationships. 

Tests. The same three types of tests were used 
for all devices. Test I involved knowledge (in 
verbal symbols) of function. Part A of this test 
consisted of five multiple choice questions dealing 
with maintenance. Trainees were asked to use the 
information from the training situation to trouble 
shoot particular operating difficulties. Part B was 
a fill-in-the-blanks test dealing with rote memory of 
the functioning sequences. There were nine scorable 
items in Part B. 

Test II was a nomenclature test in which trainees 
were asked to label particular parts of the weapon. 
There were 15 scorable items on this test. 

Tests I and II were paper and pencil tests ad- 
ministered in group form. 

Test III was a performance test and was adminis- 
tered individually by calling a trainee out of the 
group testing situation. The individual performance 
tests were given by two experimenters. One experi- 
menter presented the task and the second served as 
a timer and scorer. For the pistol performance test, 
the trainee being tested was seated at a table on 
which a .45 calibre pistol and an unloaded magazine 
had been placed. The trainee was instructed: “Load 
and fire the pistol.” The score was the time elapsed 
from the moment the trainee touched the weapon 
until the proper operations were performed. For 
the second test, the pistol was removed from view 
and cocked with the safety on. The trainee was 
then told: “The pistol is loaded and cocked. Fire 
the pistol.” The pistol was handed to the trainee 
as the instructions were given and timing began 
when the experimenter said the final word in the 
instructions, “pistol.” For the third test, the pistol 
was again removed from view, the slide drawn to 
the rear and locked by the slide stop. The pistol 
was handed to the trainee and he was instructed: 
“Release the slide.” Timing began when the word 
“slide” was said. On all three tasks, if a given 
trainee had not completed the proper operations in
90 seconds, he was shown the correct operations and
given the score of 90.⁴ A trainee’s total score was
the average of three tests. The procedures for the
rifle and carbine tests were similar to those used
with the pistol, although the tests differed with regard
to specific content.


Table 1

Means and SDs of the Error Scores for the Function Tests

                            Device
                  Pistol        Carbine         Rifle
Transparency    Mean    SD    Mean    SD    Mean    SD
Animated        4.60   2.53   6.76   1.84   4.60   2.56
Static          4.84   1.95   7.08   1.87   5.20   2.00
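
The scoring rule for the performance tests can be sketched in a few lines of Python. This is only an illustration: the function names and the sample times are hypothetical, while the 90-second ceiling and the averaging over tasks are taken from the description above.

```python
# Illustrative sketch of the performance-test scoring described above.
# Function names and the sample times are hypothetical, not from the study.

TIME_LIMIT = 90.0  # seconds; a trainee who ran past this was assigned 90


def task_score(elapsed_seconds):
    """Score for one task: elapsed time, capped at the 90-second limit."""
    return min(elapsed_seconds, TIME_LIMIT)


def total_score(task_times):
    """A trainee's total score: the average of his task scores."""
    scores = [task_score(t) for t in task_times]
    return sum(scores) / len(scores)


# Hypothetical trainee: two tasks completed, one not completed in time.
example = total_score([22.0, 14.0, 120.0])
print(example)  # (22 + 14 + 90) / 3
```

Because the ceiling replaces every time beyond 90 seconds, a trainee who never finds the correct operation contributes the same score as one who finds it at exactly 90 seconds.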


Results and Discussion 


The results of each test are considered 
separately. Table 1 presents the results of 
the function tests in terms of mean error 
scores. The analysis of variance for these 
scores is shown in Table 2. The two tables 
reveal no significant differences between the 
animated and static transparencies. For each 
device the animated transparency produced 
slightly fewer errors, but these differences did 
not approach statistical reliability. There 
were reliable differences among the mean
error scores for the devices. The pistol and
rifle did not differ from each other, but the
mean error scores for the carbine were greater
than the scores for each of the other devices.
This difference was equally apparent for both
the static and animated transparencies.

The relatively poor performance on the
carbine function test cannot be accounted
for in terms of motion complexity, since the
carbine transparency contained eight moving
parts, in contrast to eleven for the pistol and
five for the rifle. The absence of an interaction
effect indicates that the differences between
the carbine and the other two devices
did not depend upon the training condition.

The mean error scores and the SDs from
the nomenclature tests are presented in
Table 3 and the analysis of variance of these
scores is shown in Table 2. Again there were
no differences between the two types of transparencies.
However, there were differences
among the devices. Here, the differences
were a function of the relatively greater number
of errors made on the pistol test. This
effect held for both the animated and static
transparencies. The pistol was the most complex
of the devices and we might attribute
the greater number of errors to this fact.
However, a comparison of the absolute scores
of the three devices is not justified, in view
of the fact that the three tests were merely
similar in form, not identical in relative content.
This same observation holds in reference
to the differences noted above in the
function test.

4 This convention was adopted to minimize frustration
which might affect subsequent tasks. In
view of the fact that no trainees discovered the correct
operations once they had gone beyond 65 seconds,
the arbitrary assigning of 90 was probably
not too conservative. However, any bias which may
have resulted would operate against the experimental
hypotheses.

The performance tests were scored in terms
of the average time required to perform the
Table 2

Analyses of Variance of Scores for Three Tests

                     Function               Nomenclature             Performance
                      Mean                     Mean                     Mean
Source          df   Square     F        df   Square     F        df   Square     F
Device           2    74.61   15.53**     2    27.29    4.01*      2     47.93    —
Transparency     1     5.61    1.17       1     4.03     —         1   2974.83   9.04**
Interaction      2      .45     —         2     2.84     —         2    320.09    —
Within Cells   144     4.81             144     6.80             144    329.05
Total          149                      149                      149

 * .05 level of confidence.
** .01 level of confidence.


Table 3

Means and SDs of the Error Scores for the Nomenclature Tests

                            Device
                  Pistol        Carbine         Rifle
Transparency    Mean    SD    Mean    SD    Mean    SD
Animated        6.48   3.13   4.96   2.34   5.24   2.25
Static          6.56   2.94   5.72   2.47   5.08   2.02


Table 5

Frequency of Hammer and Trigger Guard Cocking

                                  Transparency
                               Animated    Static
Number Using Hammer                5         15
Number Using Trigger Guard        20         10


three (two for the carbine) tasks. The means 
and SDs of the time scores are presented in 
Table 4 and the analysis of variance is pre- 
sented in Table 2. There were consistent 
differences between the animated and static 
transparencies for each of the devices, with 
the animated transparencies showing the 
shorter performance times. These differences
were significant at the .01 level of confidence.
The magnitude of the differences between
the types of transparency appeared to
be greatest for the rifle and least for the
pistol. However, this effect was not statistically
reliable, since the F for interaction was
less than unity. 
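
The F ratios for the performance test can be checked directly from the printed mean squares, and the transparency mean square can be recovered from the cell means of Table 4. The sketch below does both. It assumes 25 trainees per cell, which is inferred from the 144 within-cells degrees of freedom (150 trainees, 6 cells) rather than stated here; the variable names are illustrative.

```python
# Mean squares for the performance test, as printed in Table 2.
ms_transparency = 2974.83
ms_interaction = 320.09
ms_within = 329.05

# Each F is the effect mean square divided by the within-cells mean square.
f_transparency = ms_transparency / ms_within
f_interaction = ms_interaction / ms_within

# Cell means from Table 4, in the order pistol, carbine, rifle.
animated_means = [28.12, 25.48, 23.00]
static_means = [33.36, 32.28, 37.68]

N_PER_CELL = 25  # assumption inferred from the degrees of freedom
n_per_condition = N_PER_CELL * len(animated_means)

animated = sum(animated_means) / len(animated_means)
static = sum(static_means) / len(static_means)

# With two equal groups of n, SS between = n * d**2 / 2, and with
# one degree of freedom the mean square equals the sum of squares.
ms_from_means = n_per_condition * (static - animated) ** 2 / 2

print(round(f_transparency, 2), round(f_interaction, 2), round(ms_from_means, 2))
```

The mean square rebuilt from the cell means agrees closely with the printed value, which is a useful check on the table entries.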

The fact that the rifle produced the great- 
est difference between the animated and static 
transparencies is of interest in spite of the 
absence of a significant interaction effect. 
What makes this difference interesting is that 
the rifle contained the fewest number of mov- 
ing parts, and according to the predictions 
stated above should have been the least sus- 
ceptible to training differences. 

Table 4

Means and SDs of the Time Scores for the Performance Test

                             Device
                  Pistol         Carbine          Rifle
Transparency    Mean     SD    Mean     SD    Mean     SD
Animated       28.12  15.23   25.48  18.52   23.00  16.68
Static         33.36  17.18   32.28  18.72   37.68  19.91

An examination of the rifle performance
test revealed differences in the behavior of
the trainees in the animated and static conditions
in the first performance task. This
task consisted of requiring the trainees to
cock and fire the trigger housing unit of the 
rifle. In the training demonstration cocking 
was done by pulling down and then pulling 
up the trigger guard. This action cocked the 
hammer, and it was this operation that was 
scored as correct. If a trainee cocked the 
hammer by pushing down the hammer with- 
out using the trigger guard, he was then 
asked to cock the rifle again, this time with- 
out pushing down on the hammer. The time 
required in the second cocking operation (not 
including the first operation) was used in the 
trainee’s final average. It was noted that 
many trainees from the static condition used 
the hammer method first, while only a few 
trainees from the animated condition used 
the hammer method. Table 5 shows the fre- 
quency of hammer and trigger guard cocking 
for the two types of transparency. 

The chi square for the data in Table 5 was 
8.33, significant at the .01 level of confidence. 
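
The reported chi square can be reproduced from the four frequencies of Table 5 (hammer: 5 animated, 15 static; trigger guard: 20 animated, 10 static). The following is a minimal sketch using the standard uncorrected formula for a fourfold table; the function names are illustrative, and the use of the uncorrected formula is an assumption that happens to match the reported value.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Uncorrected chi square for the fourfold table [[a, b], [c, d]]."""
    n = a + b + c + d
    numerator = n * (a * d - b * c) ** 2
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    return numerator / denominator


def p_value_1df(chi2):
    """Upper-tail probability of chi square with one degree of freedom."""
    return math.erfc(math.sqrt(chi2 / 2))


# Rows: hammer, trigger guard. Columns: animated, static.
chi2 = chi_square_2x2(5, 15, 20, 10)
print(round(chi2, 2), p_value_1df(chi2) < 0.01)
```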

One interpretation of this finding is that 
the animated transparency permitted the 
trainees to observe the simultaneous move- 
ment of the hammer and trigger guard, while 
in the static transparency the majority of 
trainees attended to a single moving part, the 
hammer. It would seem that the animated 
training condition provided more information 
regarding relationships between moving parts. 
This interpretation is a tentative one and deserves
further consideration where the problem
of animated and static training devices is concerned.


Summary and Conclusions 


The results of the two paper and pencil 
tests, one dealing with function and the other 
with nomenclature, indicated that animated 
transparencies were no more effective than 


20 


were static transparencies. However, the per- 
formance tests results did show that the 
animated transparencies were more effective 
training devices than were the statics. These 
findings were not related to the particular de- 
vice, but were found to be equally applicable 
to all three devices. 

The relative superiority of the animated 
transparencies in the performance test was 
predicted from previous considerations of the 
role of verbal factors in training and testing. 
It would appear that the relatively nonverbal 
performance tests are a more sensitive index 
of training than are the standard pencil and 
paper tests. This conclusion must be quali- 
fied in terms of the purpose of the training. 
If the training purpose is to impart knowl- 
edge in verbal symbols, then the static type 
of transparency may be as effective as the 
animated type. 

A further finding from an examination of 
the rifle performance test was that the effectiveness
of the animated transparency becomes
particularly apparent when more than
one part of a device is in motion at one time. 
When relationships among moving parts are 
involved, the animated transparency clearly 
appears to be more effective. 

Differences among the absolute test scores 
were observed on the paper and pencil tests. 
However, these differences were very likely a 
function of differences in the content of the 
tests, since they were not systematically re- 
lated to the complexity of the devices. 


Received March 28, 1958. 


References 


Swanson, R. A., & Aukes, L. E. Evaluation of training
devices for B-47 fuel, hydraulic, and rudder
power control systems. USAF Personnel Train. 
Res. Cent., 1956. (AFPTRC-TN-56-2.) 

Torkelson, G. M. The comparative effectiveness of
a mockup, cutaway and projected charts in teach- 
ing nomenclature and function of the 40 mm. 
antiaircraft weapon and the Mark 13 type torpedo. 
USN, 1954, Spec. Dev. Tech. Rep., SDC 269-7-100. 


Journal of Applied Psychology
Vol. 43, No. 1, 1959


DIRECTION-OF-KNOB-TURN STEREOTYPES¹,²


JAMES V. BRADLEY 


Aero Medical Laboratory, Wright Air Development Center 


Few human engineering principles appear 
to be so firmly entrenched as is the principle 
that rotary controls should turn clockwise to 
increase. There appears, however, to be little 
or no experimental evidence in support of it. 
The present study therefore was undertaken 
to determine whether it corresponds to a true
“population stereotype” or should be regarded
merely as a convention adopted for purposes
of standardization. In the case investigated, 
results are qualified by, and appear to be con- 
tingent upon, the fact that the display pre- 
sented feedback information without visible 
movement. 


Method 


The apparatus consisted of a box on the front of 
which were mounted a knob and a light, the latter 
directly above the former. At the beginning of a 
trial, the knob was always set in the center of its 
range of excursion at which point the light was 
either bright, dim or off. If the light was on, turning
the knob a small distance from the center position
produced no perceptible change in brightness.
However, turning it to either end of its range of excursion
caused the light to brighten perceptibly if it
had initially been dim or to dim perceptibly if it
had originally been bright. The apparatus was always
set so that turning the knob to an extreme position
would accomplish the effect requested of the S.

The Ss were 150 male and 150 female college students.
Each was used for a single trial under a
single experimental condition. The S was seated
facing the front of the box which rested on a table.

Two hundred and forty right-handed Ss received instructions
requiring a change in the brightness of the
light. There were eight such instructions of which
the first 26 words were identical. Fifteen males and
15 females were run under each of these eight conditions.
The common introductory portion of the instructions
and the eight different concluding phrases
are given below. The letters in parentheses preceding
each concluding phrase will serve as an identifying
code for this instruction.

1 This research was supported by the USAF under
Contract No. AF 33(616)3404 monitored by the Aero
Medical Laboratory, Wright Air Development Center.
Permission is granted for reproduction, publication,
use and disposal in whole or in part, by or
for the United States Government.

2 This experiment is described in WADC Technical
Report 57-388 (see References). The present
article is a condensed form of that report.


The Ss instructions: 


On this box is a light (Experimenter points to 
light.) which is controlled by the knob below it. 
(Experimenter points to knob.) I would like for 
you to take hold of the knob and, 


(I) increase the light, 

(D) decrease the light, 
(MB) make the light brighter, 
(MD) make the light dimmer, 

(IB) increase the brightness of the light, 
(DB) decrease the brightness of the light, 
(ID) increase the dimness of the light, or, 
(DD) decrease the dimness of the light. 


The 60 remaining Ss received a single instruction. 
The light was removed from the box, and the entire 
box was covered by a strip of cardboard through 
which only the knob protruded. The S was in- 
structed as follows: 

(T) Before you is a knob. (Experimenter points 
to knob.) When I say “ready,” reach out 
and turn the knob. . . . Ready. 

Thirty right-handed Ss and 30 left-handed Ss were 
run under this instruction. There were 15 males and 
15 females in each of the above groups. In all of 
the above cases direction of initial knob turn was 
observed and recorded by the experimenter. 

The significance of response frequencies for a single 
category of Ss was obtained from binomial tables 
(Harvard University, 1955). The significance of the 
difference in response pattern between two cate- 
gories of Ss was tested by casting the data into 
a fourfold table and consulting tables (Mainland, 
Herrera, and Sutcliffe, 1956) of probabilities based 
on Fisher’s exact method for small samples and chi 
square with Yates’ correction for large ones. All 
tests were two-tailed. 
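
The single-category test can be approximated without tables. The sketch below computes a two-tailed binomial probability under the hypothesis of equal clockwise and counterclockwise frequencies by doubling the smaller tail. The doubling rule and the function name are assumptions made for illustration; the published tables may have defined the two-tailed value slightly differently. The two calls use the turn-knob counts reported in the Results (27 of 30 and 19 of 30 clockwise).

```python
from math import comb


def binomial_two_tailed(k, n):
    """Two-tailed probability of k or a more extreme count in n trials
    when both outcomes are equally likely (doubles the smaller tail)."""
    lower = sum(comb(n, i) for i in range(0, k + 1))
    upper = sum(comb(n, i) for i in range(k, n + 1))
    tail = min(lower, upper) / 2 ** n
    return min(1.0, 2 * tail)


p_right_handed = binomial_two_tailed(27, 30)
p_left_handed = binomial_two_tailed(19, 30)
print(p_right_handed < 0.001, round(p_left_handed, 2))
```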


Results 


Complete data for this experiment are given 
in Table 1. 

Failure of sex differences to appear. Ninety- 
nine of the 150 males and 97 of the 150 
females turned the knob clockwise. Eighty- 
four of 120 males and 92 of 120 females 
turned the knob in accordance with the hypothesized
clockwise-to-increase-counterclockwise-to-decrease
stereotype. By neither measurement
is the difference in performance between
the two sexes significant. Nor does a
significant sex difference appear in the responses
to any of the 10 individual conditions.
Therefore, the data for the two sexes
will be combined in all further comparisons.

Effect of handedness on direction of knob
turn when direction of functional change is
unspecified. Of the 30 right-handed Ss asked
simply to turn the knob, 27 turned it clockwise.
Of an equal number of left-handed Ss
given the same instruction, only 19 turned
the knob clockwise. The hypothesis of equal
frequency of clockwise and counterclockwise
knob turn can be refuted at far beyond the
.001 level of significance in the former case,
but cannot be rejected in the latter case (P =
.20). The difference in response between the
two handedness groups is significant at the 
5% level. It must be concluded, then, that 
there is a very strong “stereotype” for right- 
handed Ss to turn a knob clockwise in a com- 
pletely “unstructured” situation in which the 
knob’s function is unknown and neither the 
intention to “increase” nor the intention to 
“decrease” has been specified. This stereo- 
type is significantly weaker, and may be al- 
together absent, in left-handed Ss. There is 
no indication however that left-handed Ss 
have a “turn counterclockwise” stereotype. 
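
The comparison of the two handedness groups can be illustrated with Fisher's exact method applied to the fourfold table of 27 and 3 against 19 and 11. The Method section names both Fisher's method and chi square with Yates' correction without saying which was applied to this comparison, so the choice here is an assumption, as are the function name and the two-tailed rule (summing every table no more probable than the one observed).

```python
from math import comb


def fisher_exact_two_tailed(a, b, c, d):
    """Two-tailed Fisher exact probability for the table [[a, b], [c, d]]."""
    row1, row2 = a + b, c + d
    col1 = a + c
    total = comb(row1 + row2, col1)

    def prob(x):
        # Probability that the top-left cell equals x with margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / total

    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    observed = prob(a)
    # Sum all tables that are no more probable than the observed one.
    return sum(
        prob(x)
        for x in range(lowest, highest + 1)
        if prob(x) <= observed * (1 + 1e-9)
    )


# Rows: right-handed, left-handed. Columns: clockwise, counterclockwise.
p_handedness = fisher_exact_two_tailed(27, 3, 19, 11)
print(p_handedness < 0.05)
```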
Persistence of tendency for right-handed
subjects to turn knob clockwise when direction
of required functional change is specified.
When the situation is structured by the presence
of a visible display and by instructions
as to the direction of change required in the
display, the tendency to turn clockwise persists.
Two hundred and forty Ss received
such direction-of-change instructions. For
each subgroup receiving a particular instruction
another equal-sized subgroup received
the opposite instruction. Therefore, although
instructions may have biassed the Ss’ responses
(presumably, but not necessarily, in accordance with a
clockwise-to-increase-counterclockwise-to-decrease
stereotype), opposite
biases operated upon equal numbers
of Ss so that, if the tendencies to “turn clockwise”
and “turn counterclockwise” were equal,
half the responses should have been clockwise
turns, half counterclockwise turns. Such was
not the case. Of 240 Ss, 150 turned the
knob clockwise, 90 counterclockwise, the inequality
of response frequencies being significant
at beyond the .001 level. It must
be concluded, therefore, that irrespective of any
turn-clockwise-to-increase-and-counterclockwise-to-decrease
stereotype, there is a
strong “turn clockwise” stereotype.

Existence of a strong
turn-clockwise-to-increase-counterclockwise-to-decrease
stereotype. Since the preceding paragraphs have
demonstrated a “turn clockwise” stereotype
which operates regardless of the direction of
the required change, this stereotype must
Table 1

Number of Subjects Turning Knob Clockwise (C) and Number Turning
Counterclockwise (CC) Under Each Instruction

                            Both Sexes        Males         Females
Instructions                 C      CC       C     CC       C     CC
Increase                    29***    1      15***   0      14***   1
Decrease                    11      19       4     11       7      8
Make Brighter               28***    2      14***   1      14***   1
Make Dimmer                 14      16       9      6       5     10
Increase Brightness         28***    2      13**    2      15***   0
Decrease Brightness          9*     21       4     11       5     10
Increase Dimness            13      17       9      6       4     11
Decrease Dimness            18      12       8      7      10      5

Turn Knob
  Right-Handed Subjects     27***    3      13**    2      14***   1
  Left-Handed Subjects      19      11      10      5       9      6

  * Significant at .05 level for two-tailed test.
 ** Significant at .01 level for two-tailed test.
*** Significant at .001 level for two-tailed test.


Table 2

Two-Tailed Binomial Tests of Significance on Proportion of Subjects Turning Knob Clockwise (C) and on
Proportion Conforming to a Clockwise-to-Increase-Counterclockwise-to-Decrease Stereotype (S)

Combinations of                            Sig.                              Sig.
Instructions           C     CC    %C      Level       S    Not S    %S      Level
I + D                 40     20   66.7    .01348      48     12     80.0    .0000
MB + MD               42     18   70.0    .00268      44     16     73.3    .00040
IB + DB               37     23   61.7    .09246      49     11     81.7    .00000
ID + DD               31     29   51.7    .89742      35     25     58.3    .24506
I + MB + IB + DD     103     17   85.8    .00000     103     17     85.8    .00000
D + MD + DB + ID      47     73   39.2    .02208      73     47     60.8    .02208
Total                150     90   62.5    .00012     176     64     73.3    .00000

Sexes
Males                 76     44   63.3    .0046       84     36     70.0    .00002
Females               74     46   61.7    .01338      92     28     76.7    .00000


somehow be subtracted or cancelled out from
the data before testing for a
clockwise-to-increase-counterclockwise-to-decrease
stereotype. This can be accomplished by combining
a subject group whose instructions require
a clockwise response to conform to the
latter stereotype with an equal-sized subject
group whose instructions require a counterclockwise
response in order to conform to it.
This has been done in Table 2. The results
show that 176 of the 240 Ss responded in accordance
with the hypothesized stereotype.
Such an extreme split of responses would occur
less than once in 100,000 times by chance.
The hypothesized stereotype is also significantly
supported by each of the subgroups of
Table 2 except for the ID + DD subgroup.

It is of interest to learn whether or not the
hypothesized stereotype is one-sided, i.e., applies
only when an increase is called for. If
such were the case, the general tendency to
turn clockwise would simply be strengthened
when an increase was required and unaffected
otherwise. Of 120 Ss required to make an
increase, 103 turned the knob clockwise. Of
an equal number required to make a decrease,
73 turned it counterclockwise. Both
results are significantly (P < .025) in conformity with a
clockwise-to-increase-counterclockwise-to-decrease
hypothesis, but differ
significantly (P < .001) in the proportion
conforming. It must be concluded, then,
that while the hypothesized stereotype operates
in both directions it is weakened when
a decrease is required, presumably because 
this throws it into conflict with the “turn 
clockwise” stereotype. 
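
The cancelling-out procedure amounts to pooling each instruction with its opposite and then counting responses two ways: by direction of turn and by conformity with the stereotype. The sketch below does this from the both-sexes counts of Table 1. The dictionary layout and names are illustrative, and the counts are as read from Table 1 and cross-checked against the totals of Table 2. "Decrease dimness" is grouped with the instructions that call for a brighter light, as in Table 2.

```python
# (clockwise, counterclockwise) counts, both sexes, by instruction code.
counts = {
    "I": (29, 1), "D": (11, 19),
    "MB": (28, 2), "MD": (14, 16),
    "IB": (28, 2), "DB": (9, 21),
    "ID": (13, 17), "DD": (18, 12),
}

# Instructions calling for a brighter light; the rest call for a dimmer one.
BRIGHTER = {"I", "MB", "IB", "DD"}
DIMMER = set(counts) - BRIGHTER


def tally(codes):
    """Return (clockwise turns, stereotype-conforming turns, total Ss)."""
    clockwise = sum(counts[c][0] for c in codes)
    conforming = sum(
        counts[c][0] if c in BRIGHTER else counts[c][1] for c in codes
    )
    total = sum(sum(counts[c]) for c in codes)
    return clockwise, conforming, total


overall = tally(counts)
print(overall)            # clockwise, conforming, total for all 240 Ss
print(tally(BRIGHTER))    # Ss asked for an increase
print(tally(DIMMER))      # Ss asked for a decrease
```

Pooling opposite instructions leaves the clockwise count free of any increase or decrease bias, while the conforming count is free of the general turn-clockwise tendency.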

Effect of phraseology upon strength of
clockwise-to-increase stereotype. Another fac- 
tor which can be checked is whether or not 
the words “increase” or “decrease” are re- 
quired to evoke the stereotype. There is no 
significant difference in pattern of response 
between any two of the three subgroups, MB 
+ MD, I+ D, and IB + DB. Presumably, 
then, it would be just as effective, but more 
economical, to label a knob “brightness” as 
to label it “increase brightness.” There is,
however, a difference in response, significant
at the .01 level, between the IB + DB and
ID + DD subgroups. The stereotype can be
reduced in strength, therefore, and perhaps
even eliminated, but not reversed, by phrasing
the operator’s instructions in terms of the
“antifunction.”


Discussion


It has been demonstrated that when ma- 
nipulating a rotary knob to effect changes in 
a motionless display, operators show a strong 
predilection for turning clockwise-to-increase, 
counterclockwise-to-decrease. There is reason 
to believe, however, that this stereotype be- 
comes greatly attenuated when the task is 
structured elaborately enough for other determinants
of direction of knob turn to be
present and to compete with it. Specifically,
data presented in (Bradley: 1954, 1957; 
Holding, 1957; Warrick, 1947) and discussed 
in (Bradley, 1957) indicate that the stereo- 
type, if it exists at all, is not the primary de- 
terminant of direction of knob turn when 
knob rotation causes motion of an observed 
display indicator. 


Summary 


Right-handed Ss were asked to grasp a 
knob and turn it so as to effect a specified 
change in the intensity of a light mounted 
just above it. Equal numbers of Ss were 
asked to increase and to decrease the bright- 
ness of the light, the request being phrased 
in a variety of ways. Two significant tend- 
encies were found. 

First, 73.3% of the Ss turned the knob 
clockwise-to-increase or counterclockwise-to- 
decrease the brightness of the light. This 
tendency was strongest when an increase was 
required and when the function to be con- 
trolled was phrased in positive terms (i.e., as 
“brightness” rather than “dimness”). It was 
not significantly dependent upon the use of 
the words “increase” or “decrease” or upon 
the sex of the operator. Other experiments 
indicate that these results are contingent upon 
the use of a display which presents changes 
in information without visible movement.

Second, 62.5% of all Ss turned the knob 
clockwise. This general turn-clockwise tend- 
ency was found to persist among an addi- 
tional set of right-handed Ss when the light 
was covered up and the S was asked simply 
to turn the knob; among left-handed Ss (used 
only in this condition) the tendency to turn 
clockwise was not statistically significant. 


Received March 28, 1958. 


References 


Bradley, J. V. Desirable control-display relationships
for moving-scale instruments. Wright Air
Develpm. Cent., 1954. (Tech. Rep. 54-423.)

Bradley, J. V. Direction-of-knob-turn stereotypes.
Wright Air Develpm. Cent., 1957. (Tech. Rep.
57-388, ASTIA Document No. AD 130835.)

Harvard University, Staff of the Computation Laboratory.
Tables of the cumulative binomial probability
distribution. Cambridge, Mass.: Harvard
Univer. Press, 1955.

Holding, D. H. Direction of motion relationships between
controls and displays moving in different
planes. J. appl. Psychol., 1957, 41, 93-97.

Mainland, D., Herrera, L., & Sutcliffe, Marion I.
Tables for use with binomial samples. Dept. of
Medical Statist., New York Univer. Coll. of Medicine,
1956.

Warrick, M. J. Direction of movement in the use
of control knobs to position visual indicators.
USAF Air Materiel Command, Wright-Patterson
Air Force Base, 1947. (Memorandum Rep. No.
TSEAA-694-4C.)




DEVELOPMENT AND VALIDATION OF ADAPT- 
ABILITY CRITERIA 


DAVID K. TRITES, ALBERT L. KUBALA,¹ AND BART B. COBB


School of Aviation Medicine, USAF 


In any research program concerned with
the development of predictor tests, it is necessary
to have criterion measures of the behavior
to be predicted. Frequently these measures,
such as pass-fail in a training program
or a measure of job productivity, are readily
available; but just as frequently it is necessary
to synthesize new criterion measures if
the behavior to be predicted does not already
have well-defined operational specificity. The
latter was the case when the research to be
presented was undertaken. The report describes
the methods and results of a project
to develop multiple criteria for use in an
Air Force pilot selection research program
(Kubala, 1958; Sells: 1951, 1956; Trites,
Kubala, & Cobb, in press), and the construct
validation of these criteria by means
of the independent categorization (Brown &
Trites, 1957) of Ss predicted to be at the extremes
of the criterion dimensions.

The criterion measures of primary interest
in this investigation are related to a construct
called adaptability (Sells, 1956) which has
been postulated to account for certain behavior
observed during and subsequent to pilot
training. The construct refers to temperamental
and motivational characteristics, such
as emotional disturbance or program-oriented
motivational deficit, which contribute to a
man’s success or failure in training and his
continuing adjustment to military flying. As
such, it may be contrasted with the construct
of ability which refers primarily to aptitude
or skill factors accounting for flying training
success or failure and for which a criterion of
pass-fail in training is most appropriate.

Inferences drawn from the two constructs
and results of previous research (Brown &
Trites, 1957; Sells: 1951, 1956) suggested
that categorization of training outcome as
pass or failure, due principally to ability,
motivational, or emotional causes, should be
differentially related to several relatively independent
criterion dimensions identified by
factor analysis of data collected during pilot
training. This procedure follows the interpretation
of the construct validation process
given by Campbell and Tyler. “A given scientific
construct has multiple potential operational
specifications. If, as sampled, the operational
specifications concur, the construct
and the sampled measurement techniques
have validity” (Campbell & Tyler, 1957,
p. 91). In addition, it recognizes the suggestion
recently made by Ghiselli with respect
to the dimensional aspects of criteria. As he
indicated, “. . . it would appear that . . .
performance on any given job is best described
in terms of several dimensions, and
one dimension is not sufficient” (Ghiselli,
1956, p. 2).

1 Now at the Texas Women’s University.


Procedure


Sample 

The 792 Ss forming the basic sample were avia- 
tion cadets in Classes 34-M to 56-E who entered pri- 
mary flight training at Graham Air Base, Marianna, 
Florida, between July 1953 and November 1954. All 
were between the ages of 19 and 28 and had already 
been preselected by a rigorous physical examination 
and a battery of aptitude tests. From these, 377 Ss 
having relatively complete data on all the variables 
considered in the study were available for intensive 
analysis. Subsequent analyses, based upon findings 
with the smaller group, utilized the total sample. 


Variables 

A total of 23 variables, representing information 
collected prior to, during, or at the end of flight 
training, were studied. These may be grouped as: 

1. Test scores resulting from relatively objective 
measuring devices or objectively verifiable characteristics
of the Ss; e.g., Age, Pilot Stanine, Officer
Quality Stanine, Academic Average, Demerits, and
Solo Time.²

2 Tables giving a complete description of the variables,
the matrix of intercorrelations, the original
centroid loadings, the quartimax loadings, and final
adjusted loadings are contained in Trites et al. (in
press).


Table 1

Faculty Board Classification System

Failure Category        Method of Classification*

AE—Ability              Students who are apparently well motivated to complete training, with
                        little or no evidence of fear or apprehension of flying, and are clearly
                        eliminated because of inability to meet flying or academic standards.

ME—Motivational         Students clearly evidencing a lack of motivation to complete training
                        although apparently possessing adequate ability and indicating little or
                        no fear or apprehension of flying. Most frequently includes:
                        1. Repeated violators of training rules.
                        2. Self-initiated eliminations indicating lack of motivation.

EE—Emotional            Students clearly evidencing fear or apprehension of flying or exhibiting
                        disabling personality inadequacies. Most frequently includes:
                        1. Self-initiated eliminations because of “fear of flying.”
                        2. Self-initiated eliminations where flight surgeon has indicated failure
                           to be due to disabling personality inadequacies.
                        3. Other eliminations where evidence indicates real cause to be fear or
                           apprehension of flying.

Ad. E—Administrative    Students receiving hardship or compassionate discharge or eliminated for
                        physical causes where motivation, ability, and emotional status appear
                        adequate.

* These descriptions are condensations of those actually used by the raters (Brown & Trites, 1957).


2. Ratings of the Ss by (a) peers, (b) superiors,
or (c) experts (Trites & Sells, 1957); e.g., ratings
of Judgment, Leadership, Combat Stress Tolerance,
and Familiarity, or evaluation based upon medical
histories and actual performance during training
flights.

Most of the variables represented data available
by the end of the primary phase of flight training
(approximately the first six months). However,
[...] Pass-Fail in flight training [...]
the board decides w[...] eliminated or returne[...]
supporting documents, constitutes the faculty board
proceedings.

It has been found (Brown & Trites, 1957) that
these proceedings contain sufficient information to
permit reliable classification of each failure into
ability, motivational, or emotional deficiency categories.³
Such classifications were made for failures
in the present study independently of all the other
variables investigated.

A description of the procedure used to derive the
Faculty Board Classifications is presented in Table 1.


Treatment of Data 


[...] computer programmed for the Thurstone centroid method
(Thurstone, 1947) and the quartimax
method of rotation (Neuhaus & Wrigley, 1954) [...]
orthogonal simple structure. [...]

[...]tion of the factors and computa[...]res,
Ss were grouped as either Pass
or, on the basis of their Faculty Board Classification,
as ability, motivational, or emotional failures.
Hypotheses concerning differences in factor scores 
between the different groups were examined for all 
Ss by one-way classification analysis of variance. 
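
A one-way classification analysis of variance of this kind can be sketched as follows. The factor scores below are invented purely for illustration and are not data from the study; the group labels follow the pass and failure categories described above, and the function name is hypothetical.

```python
def one_way_anova(groups):
    """Return (F, df_between, df_within) for a dict of score lists."""
    all_scores = [x for scores in groups.values() for x in scores]
    grand_mean = sum(all_scores) / len(all_scores)

    ss_between = 0.0
    ss_within = 0.0
    for scores in groups.values():
        mean = sum(scores) / len(scores)
        ss_between += len(scores) * (mean - grand_mean) ** 2
        ss_within += sum((x - mean) ** 2 for x in scores)

    df_between = len(groups) - 1
    df_within = len(all_scores) - len(groups)
    f_ratio = (ss_between / df_between) / (ss_within / df_within)
    return f_ratio, df_between, df_within


# Hypothetical factor scores for the four outcome groups.
hypothetical_scores = {
    "pass": [4.0, 5.0, 6.0],
    "ability failure": [3.0, 4.0, 5.0],
    "motivational failure": [2.0, 3.0, 4.0],
    "emotional failure": [1.0, 2.0, 3.0],
}

f_ratio, df_between, df_within = one_way_anova(hypothetical_scores)
print(f_ratio, df_between, df_within)
```

The same function would be applied once per factor, with the resulting F referred to tables at the stated degrees of freedom.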


Results 


Findings are presented in two parts. The 
first covers the results of the factor analysis; 
the second describes the evaluation of the hy- 
Potheses derived from consideration of the 
factor structure and the adaptability con- 
struct. 


Factor Analysis


Eight factors were extracted from the intercorrelations
of the 22 variables. Nineteen
iterations were required to stabilize the communality
estimates to meet an arbitrary criterion
for a change of less than .005 in the
estimates for each variable. At this point
the largest value in the residual correlation
matrix was .04. After graphic adjustment of
the quartimax rotations, Factors VI, VII, and
VIII were dropped from consideration.
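
The stopping rule for the communality estimates can be illustrated with a much reduced sketch. It extracts a single centroid factor from a small hypothetical correlation matrix and repeats the extraction until no communality estimate changes by .005 or more. The one-factor simplification, the starting estimates, the matrix, and all names are assumptions made for illustration; the study itself extracted eight factors with a computer program that is not reproduced here.

```python
def centroid_first_factor(r):
    """First centroid loadings: column sums over the root of the grand sum."""
    size = len(r)
    col_sums = [sum(r[i][j] for i in range(size)) for j in range(size)]
    grand_sum = sum(col_sums)
    return [s / grand_sum ** 0.5 for s in col_sums]


def iterate_communalities(corr, tol=0.005, max_iter=100):
    """Re-estimate communalities until every change is below tol."""
    size = len(corr)
    # Starting estimate: largest absolute off-diagonal value in each row.
    h2 = [
        max(abs(corr[i][j]) for j in range(size) if j != i)
        for i in range(size)
    ]
    for iteration in range(1, max_iter + 1):
        # Place the current communality estimates on the diagonal.
        reduced = [
            [h2[i] if i == j else corr[i][j] for j in range(size)]
            for i in range(size)
        ]
        loadings = centroid_first_factor(reduced)
        new_h2 = [a * a for a in loadings]
        change = max(abs(new - old) for new, old in zip(new_h2, h2))
        h2 = new_h2
        if change < tol:
            return loadings, h2, iteration
    raise RuntimeError("communality estimates did not stabilize")


# Hypothetical correlations generated from loadings .8, .7, .6, .5.
true_loadings = [0.8, 0.7, 0.6, 0.5]
example_corr = [
    [1.0 if i == j else true_loadings[i] * true_loadings[j] for j in range(4)]
    for i in range(4)
]

loadings, communalities, n_iterations = iterate_communalities(example_corr)
print([round(a, 2) for a in loadings], n_iterations)
```

A tighter tolerance would require more iterations; the .005 figure used here is the criterion reported in the text.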

Factor I. The various peer ratings had 
their heaviest loadings on this factor with 7 
of the 12 ratings having loadings greater than 
.80. Examination of the definitions of the
most heavily saturated variables suggested
that the factor reflected a respectful attitude
on the part of peers toward those men endowed
with attributes represented in the ratings.
The fact that age had a relatively large
positive loading on this factor agreed with
the interpretation since older men were probably
perceived as exhibiting superior judgment,
leadership, and so on. Finally, identification
of a second independent factor, based
upon peer ratings, lessened the possibility that
this was merely an instrumental factor reflecting
only that ratings had been made in
a similar fashion on similar scales. Consequently,
the name given the present factor
was Peer Respect.

Factor II. The three variables with the 
Jighest loadings on this factor were peer rat- 
igs whose definitions suggested an orienta- 
lon toward the group and an interest in 
Working in harmony with others. This im- 
Peg a reciprocal acceptance by peers. The 
Oading of the variable based upon medical 


history fitted this interpretation since it is 
logical, in the context of the pilot training 
program, that men with fewer medical com- 
plaints would be perceived as more accept- 
able on a team, likeable, and cooperative. 

The impression of general acceptability, or 
likeability, as an associate, inherent in the 
principal defining variables, suggested that 
this represented a secondary dimension of 
peer evaluation independent of the more 
clearly defined Peer Respect Factor. There- 
fore, it was named the Peer Acceptance 
Factor. 

Factor III. The defining variables for this 
factor were a score based on the number of 
demerits, ratings by instructors, and a peer 
rating of cooperation. Inasmuch as demerits 
were awarded by a man’s tactical officer in- 
structor and a man with many demerits was 
likely to be perceived by peers and instruc- 
tors as less cooperative and conforming, the 
grouping of variables was understandable. It 
is plausible to assume that a man who was 
resistant to the demands of the training situa- 
tion would have a low score on this factor. 
Hence, it was called the Group Conformity 
Factor. 

Factor IV. Heavy loadings of variables 
representing academic performance, intelli- 
gence, and training outcome suggested that 
the factor be named the Academic Achieve- 
ment Factor. It was considered to represent 
an ability dimension. 

Factor V. The only variables having size- 
able loadings on this factor were those in- 
dicative of actual accomplishment in fly- 
ing. The obvious conclusion was that this 
represented the ability dimension of Flying 
Achievement. 

It is noteworthy that all but one of the five 
factors could be matched by inspection with 
factors extracted in an earlier investigation 
of somewhat different training level criteria 
(Kubala, in press). The unmatched factor,
Peer Acceptance, was probably confounded
with the peer factor extracted in the earlier
study.


Evaluation of the Criterion Dimensions and 
the Adaptability Construct 


Consideration of the adaptability construct 
and the defining variables for each factor led 


28 David K. Trites, Albert L. Kubala, and Bart B. Cobb 


to the formulation of four specific hypotheses 
about the largest and smallest factor-score 
means within each factor for Ss categorized 
as pass or ability, motivational, or emotional 
fail. The hypotheses, with their rationales, 
were: 

1. Since Ss with the greatest ability and 
adaptability should be among those who com- 
pleted the training program, the pass group 
should have the largest mean of any group 
on all factors. 

2. Since the Flying Achievement and Aca- 
demic Achievement Factors were considered 
ability dimensions, and since ability failures 
are expected to reflect defective ability pri- 
marily, the ability fail group should have the 
smallest mean of any group on these two 
factors. 

3. Previous research (Brown & Trites,
1957) indicated that men classified as
emotional failures (poor adaptability) tended to
be eliminated relatively early in the training
program. Such men should be very evident
to peers in the close associations of the
competitive pilot training environment soon after
entry. Since the peer ratings used in the
present study were obtained within the first
six weeks of training, it was hypothesized
that the emotional fail group should have the
smallest means of any group on the Peer
Respect and Peer Acceptance Factors.

4. Unlike the emotional failures, men who
have been classified as failing for lack of
motivation tended to be eliminated late in the
training program and, on measures derived
from flying performance, they look much
like pass Ss (Brown & Trites, 1957). Even
so, they are obviously exhibiting insufficient
adaptability to the demands of the situation,
which may be reflected in an overt lack of
conformity apparent to others and producing
sanctions such as demerits. Hence, the
motivational fail group should have the smallest
mean of any group on the Group Conformity
Factor.

Scores for the five interpretable factors
were computed for all Ss in the total sample
having the required data. Any variable used
to estimate a factor score and not already in
stanine form was rescaled to have a mean and
standard deviation approximately equal to
those of the stanine scale. Factor scores were
then obtained by algebraic combination of
the unweighted scores (Trites & Sells, 1955)
on the appropriate variables.

As mentioned previously, the hypotheses
were evaluated by one-way classification
analysis of variance. Table 2 contains the
F values, means, and the number of Ss in
each group for each factor. The comparison
of the means of the pass, ability fail,
motivational fail, and emotional fail groups
was in each case significant at less than the
.01 level. In every instance the groups with
the largest and smallest means were those
which had been hypothesized.⁴


Table 2

Factor Score Means and F Values from Analysis of Variance

                                          Failures
Factor                      Pass   Ability  Motivational  Emotional   F
Peer Respect          Mean  36.5    30.8       31.6         25.2     14.61***
                      N     580     108        55           25
Peer Acceptance       Mean  10.3     9.4        9.5          8.4      4.29**
                      N     580     108        56           26
Group Conformity      Mean  21.5    19.6       17.5         18.6
                      N     583     108        55           26
Academic Achievement  Mean  10.2     7.9        8.9          9.4
                      N     583     108        56           27
Flying Achievement    Mean  10.7     7.4       10.0          8.3
                      N     583     107        50           20

** Significant at less than the .01 level.
*** Significant at less than the .001 level.
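
The scoring and testing steps described in this part of the report (rescale each variable to the stanine metric, combine the unweighted scores algebraically, then compare group means by one-way analysis of variance) can be sketched as follows. The sketch is illustrative only: the function names, the use of a mean of 5 and a standard deviation of 2 for the stanine scale, and all of the data are assumptions, not values taken from the study.

```python
from statistics import mean, pstdev

STANINE_MEAN = 5.0   # stanine scales are conventionally centered at 5
STANINE_SD = 2.0     # with a standard deviation of about 2


def to_stanine_scale(values):
    """Linearly rescale raw scores to the stanine mean and standard deviation."""
    m, s = mean(values), pstdev(values)
    return [STANINE_MEAN + STANINE_SD * (v - m) / s for v in values]


def unweighted_factor_scores(variables, signs):
    """Algebraic (signed) sum of unweighted variables, one score per subject."""
    n_subjects = len(variables[0])
    return [sum(sign * var[i] for var, sign in zip(variables, signs))
            for i in range(n_subjects)]


def one_way_f(groups):
    """F ratio and degrees of freedom for a one-way classification."""
    all_scores = [x for g in groups for x in g]
    grand = mean(all_scores)
    ss_between = sum(len(g) * (mean(g) - grand) ** 2 for g in groups)
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between = len(groups) - 1
    df_within = len(all_scores) - len(groups)
    f = (ss_between / df_between) / (ss_within / df_within)
    return f, df_between, df_within


# Hypothetical raw data for six subjects: demerits count against conformity.
demerits = [3, 7, 0, 12, 5, 9]
instructor_rating = [6, 5, 8, 3, 6, 4]            # assumed already in stanine form
rescaled = to_stanine_scale(demerits)
conformity = unweighted_factor_scores([rescaled, instructor_rating], [-1, +1])
print([round(score, 2) for score in conformity])

# Small made-up groups to show the F computation.
f_check, df_b, df_w = one_way_f([[1, 2, 3], [2, 3, 4], [5, 6, 7]])
print(round(f_check, 2), df_b, df_w)
```

With four outcome groups, as in Table 2, the same function would be called with four lists of factor scores, giving 3 degrees of freedom between groups.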


Discussions and Conclusions 


Agreement among the different
characterizations of adaptability, derived from the
factor structure and the adaptability construct,
supports the validity of the construct. Within
the limits of the study, it has been possible
to define operationally three dimensions of
adaptability, Peer Respect, Peer Acceptance,
and Group Conformity, and two dimensions
of ability, Flying Achievement and Academic
Achievement. Through confirmation of the
hypotheses concerning the relationships
between these dimensions and groups of Ss
classified as pass, ability fail, motivational
fail, and emotional fail, it may be concluded
that useful criteria of adaptability have been
isolated.

Further evaluation of the adaptability
dimensions is necessary. It is of particular
importance to determine the relation of
individual differences among the pass Ss to
posttraining adaptability assessments. Fortunately,
there is evidence (Kubala, in press; Trites
& Kubala, 1957) that criterion assessments
corresponding to the Peer Respect and Group
Conformity Factors are marginally, but
significantly, related in the expected direction
to posttraining evaluations of adaptability.
In addition, an unpublished investigation of
Officer Effectiveness Reports has revealed a
significant correlation in the predicted
direction between demerits accrued during
primary pilot training and later ratings of
effectiveness.

4 Since the majority of failures occurred during
primary flight training and all of the variables used
in the factor scores were collected during this same
period, the possibility existed that the Faculty Board
members may have been aware of some of the items
included in the factor scores. To avoid this possible
lack of independence, the variance analysis was
replicated using only pass Ss and eliminees from the
basic training phase. Although the number of
failures was greatly reduced, the hypotheses were
generally supported by the over-all F tests and the
direction of mean differences, or by the finding that
the only significant differences between individual
pairs of means, using the Scheffé criterion (1953),
were those which had been predicted as being
highest and lowest on the factors.

On the basis of these findings the Group 
Conformity and the Peer Respect Factors 
have been combined to form a composite 
Adaptability Index. This Index, together 
with the other factor scores, can be used to 
structure samples in order to achieve better
control for research purposes.


Summary 


A factor analysis of 22 variables obtained 
for aviation cadets during pilot training re- 
vealed five interpretable factors: Peer Re- 
spect, Peer Acceptance, Group Conformity, 
Academic Achievement, and Flying Achieve- 
ment. Hypotheses derived from the construct 
of adaptability were supported by comparison 
of factor scores for groups of Ss classified ac- 
cording to training outcome as pass, ability 
fail, motivational fail, or emotional fail. This 
was considered evidence for the validity of
the construct and the usefulness of the
criterion dimensions.
Received March 28, 1958. 
References 


Brown, W. F., & Trites, D. K. Adaptability screen- 
ing of flying personnel: Early flight behavior as 
an index of subsequent adaptability to flying train- 
ing. Randolph AFB, Texas: School of Aviation 
Medicine, 1957. (USAF Rep. No. 57-114.) 

Campbell, D. T., & Tyler, B. B. The construct va- 
lidity of work-group morale measures. J. appl. 
Psychol., 1957, 41, 91-92. 

Cronbach, L. J., & Meehl, P. E. Construct validity
in psychological tests. Psychol. Bull., 1955, 52,
281-302.

Ghiselli, E. E. Dimensional problems of criteria.
J. appl. Psychol., 1956, 40, 1-4.

Kubala, A. L. Adaptability screening of flying
personnel: Preliminary analysis and validation of
criteria of adaptability to military flying. Randolph
AFB, Texas: School of Aviation Medicine, 1958
(in press). (USAF Rep. No. 58-121.)

Neuhaus, J. O., & Wrigley, C. The quartimax
method. Brit. J. stat. Psychol., 1954, 7, 81-91.

Scheffé, H. A method for judging all contrasts in 
the analysis of variance. Biometrika, 1953, 40, 
87-104. 

Sells, S. B. A research program on the psychiatric
selection of flying personnel. I. Methodological
introduction and experimental design. Randolph
AFB, Texas: School of Aviation Medicine, 1951.
(USAF Proj. No. 21-37-002, Rep. No. 1.)

Sells, S. B. Further developments on adaptability
screening of flying personnel. J. aviat. Med.,
1956, 27, 440-451.

Thurstone, L. L. Multiple factor analysis.
Chicago: Univer. Chicago Press, 1947.

Trites, D. K., & Kubala, A. L. Characteristics of
successful pilots. J. aviat. Med., 1957, 28, 34-40.




Trites, D. K., Kubala, A. L., & Cobb, B. B.
Criterion dimensions of adaptability to pilot training.
Randolph AFB, Texas: School of Aviation
Medicine, USAF (in press).

Trites, D. K., & Sells, S. B. A note on alternative
methods for estimating factor scores. J. appl.
Psychol., 1955, 39, 455-456.

Trites, D. K., & Sells, S. B. Combat performance:
measurement and prediction. J. appl. Psychol.,
1957, 41, 121-130.


Journal of Applied Psychology
Vol. 43, No. 1, 1959


A COMMENT ON THE RECENT STUDY OF THE 


MECHANICAL COMPREHENSION TEST (CC) 
BY R. L. DECKER 


W. A. OWENS 
Iowa State College 


A recent article, intended as a partial
validation of the Mechanical Comprehension
Test, Form CC, indicates a considerable lack
of understanding of the purposes for which
the test was developed and of its appropriate
uses. It is also lacking in accuracy to such
an extent that some comments on it appear
to be in order.

First, Decker (1958) quotes the writer as
saying that Form CC was designed to
measure the degree of aptitude needed for
successful performance in engineering courses
or in engineering positions after graduation
(italics mine). But what the original article
(Owens, 1950) actually said was: “The
central purpose of the present investigation is
to evaluate this new test of Mechanical
Comprehension, Form CC, in the potential
selection of engineering students” (italics mine).

Second, Decker says:

On the basis of the results obtained when the test
was administered to 725 incoming freshmen, Owens
concludes that the Form CC scores were making a
significant independent contribution to the
prediction of engineering school grades. Although this
conclusion may be supported by the obtained data,
it seems that the amount of this contribution was
small and that consideration of scores on the test in
addition to general aptitude test scores in selecting
students or employees might not be justifiable
considering the added time and expense involved.

This is a particularly curious statement since
a table of the original article deals with this
precise point and indicates the independent
contributions of the ACE, high school
average, and the MCT to the prediction of
certain criteria. The most relevant of
these is Engineering Drawing grade, to the
prediction of which the ACE contributes
44%, the high school average 15%, and the
MCT 41% of the total predictable variance.
Certainly by the standards of most students
of measurement this last is a substantial and
statistically significant contribution, and the
test is worth giving if the ACE is worth
giving.

Third, Decker used the MCT to predict 
supervisory performance in a group of Ss 
characterized as follows: (a) 80% were col- 
lege graduates in engineering or the physical 
sciences; (b) they had served an average of 
four and one-half years in a large manufac- 
turing organization; and (c) they had all 
been screened with a general aptitude test 
prior to hiring. It goes almost without say- 
ing that MCT was not designed to predict 
supervisory performance; and that even if the 
criterion had been appropriate, the subject 
group is so clearly restricted at a high level 
as to render prediction on the basis of me- 
chanical aptitude difficult if not impossible. 

Fourth, having found a test-criterion
correlation of 0.074, Decker made an item
analysis. He obtained 14 significant item-test
correlations, 3 of which were negative and 11 of
which were positive (not too strangely the
median value of the former was −0.25 and
that of the latter +0.25). Without benefit
of cross-validation, he then used the 11 items
as a test, obtained a criterion correlation of
0.31 and recommended: “further research to
determine the characteristics of the valid
items. . . .” In this context, the writer does
not feel called upon to say poorly what
Cureton (1950) has already said well.


Received February 28, 1958. 


References 


Cureton, E. E. Validity, reliability and baloney. 
Educ. psychol. Measmt, 1950, 10, 94-96. 

Decker, R. L. A study of the Owens-Bennett
Mechanical Comprehension Test (Form CC) as a
measure of the qualities contributing to successful
performance as a supervisor of technical
operations in an industrial organization. J. appl.
Psychol., 1958, 42, 50-53.

Owens, W. A., Jr. A difficult new test of
Mechanical Comprehension. J. appl. Psychol., 1950, 34,
77-81.


Journal of Applied Psychology 
Vol. 43, No. 1, 1959 


RELATIONSHIP OF NATIONAL MERIT SCHOLAR- 
SHIP SCREENING TEST SCORES TO TEST 
DATA OBTAINED EARLIER IN HIGH 
SCHOOL 


EDWARD O. SWANSON AND WILBUR L. LAYTON


Student Counseling Bureau 


Nation-wide attention has been focused on 
the problem of financing higher education. 
Scholarships, both private and federal, have 
been the topic of discussion recently. Most 
of this attention has been devoted to the 
granting of scholarships after the student has 
virtually completed high school and indicated 
his desire for higher education. Very little 
attention has been given to the fact that 
family attitudes and other social and cultural 
pressures combine to keep students from 
higher education even though they have the 
ability and may have the necessary financial 
resources (Berdie, 1954). A recognition of 
these attitudinal and societal factors empha- 
sizes the need for early identification of talent 
and the counseling of talented students be- 
ginning as early as possible in their school 
career. The ninth grade appears to be a criti- 
cal time in the career of students and it is at 
this point that identification might best take 
place and counseling begin. As our store of 
knowledge about the long-range predictive va- 
lidity of tests used for identification purposes 
in the ninth grade grows, this counseling can 
become more efficient. The present study 
describes an attempt to determine the pre- 
dictive validity of tests given to students 
early in their high school careers through the 
Minnesota State-Wide Testing Program. Spe- 
cifically, these test data were correlated with 
scores on the National Merit Scholarship Cor- 
poration Screening Test. It was hoped that 
the results of this study would give counselors 
information about how students' early test
data are related to scores on the Screening 
test. 

In 1955 the initial screening test of the 
National Merit Scholarship Corporation was 
given to 1543 high school seniors in 336 high 
schools in Minnesota. This group consisted 
of 659 boys and 884 girls. From the files of
the Student Counseling Bureau State-Wide
Program, test scores of these persons were 
obtained and correlated with their scholar- 
ship screening test scores. In addition to the 
correlation, means and standard deviations 
were determined. 

Table 1 shows by sex and by sex combined
the means, standard deviations, and number
of those persons in the screening group who
had taken each test administered through the
state-wide testing program. This table also
shows the students' year in school during
which the state-wide test was taken. For
example, Table 1 shows that 178 boys and
315 girls in the screening group had taken
the American Council on Education
Psychological Examination (ACE) during their
freshman year in high school. The mean of
the boys' scores on the ACE is 94.0 and the
standard deviation is 15.4. For the girls, the
mean is 91.7 and the standard deviation is
14.8. For boys and girls combined, the mean
of the ACE taken as freshmen is 92.5 and the
standard deviation is 15.0.
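
As a check on how the combined-sex figures relate to the separate figures, the combined mean should equal the N-weighted average of the boys' and girls' means. The short sketch below uses only the freshman ACE values quoted in the paragraph above; the function name is an illustrative assumption.

```python
def pooled_mean(groups):
    """N-weighted mean of several group means; groups is a list of (N, mean)."""
    total_n = sum(n for n, _ in groups)
    return sum(n * m for n, m in groups) / total_n


# Freshman ACE: 178 boys with mean 94.0, 315 girls with mean 91.7.
ace_freshman = [(178, 94.0), (315, 91.7)]
combined = pooled_mean(ace_freshman)
print(round(combined, 1))
```

The result agrees with the combined mean of 92.5 reported for the 493 students. The same check can be applied to any row of Table 1.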

The boys' mean on the ACE corresponds
to a centile rank of 95 based on Minnesota
state-wide norms. The girls' mean
corresponds to a centile rank of 94 on the same
norms, and the mean for the combined sex
group corresponds to a centile rank of 95+.

From Table 1 we see that the mean high
school rank for boys is 91.8 and for girls is
94.7. This group, highly selected on high
school rank, ranked on the average in the
upper 5% compared to Minnesota state-wide
norms on a ninth-grade scholastic aptitude
test.

Though the women averaged higher than the
men on high school rank, 94.7 to 91.8, the
men averaged significantly higher than
the women on the Merit Screening test itself,
57.7 to 49.8. Men averaged higher than
the women on all the tests studied except the
Cooperative English Tests given at the ninth
grade and again at the eleventh grade, and
the Clerical part of the Differential Aptitude
Tests (DAT). The women scored higher on
these tests. On the Numerical Ability part
of the DAT, the men's and women's averages
were equal.

Table 2 shows the correlation coefficients 
between scores on the National Merit Schol- 
arship Screening Test and the scores on the 
various state-wide program tests. 

In Table 2 note that of those tests administered
at the ninth-grade level, the ACE, the
Cooperative English Test, and the Verbal 
Reasoning section of the Differential Apti- 
tude Test Battery yield the highest correla- 
tions. Unfortunately, we had no sample of 
students who had taken the Iowa Tests of 
Educational Development (ITED) as ninth 
graders, for ITED composite scores obtained 
on later grades show a very substantial rela- 
tionship to Screening test scores. In making 
comparison among these tests, one must re- 
call that the ITED involves approximately 
eight hours of administration time whereas 


Table 1


Means and Standard Deviations of Minnesota State-Wide Program Tests of Those Persons
Taking the National Merit Scholarship Screening Examination in 1955


                            Men                 Women            Combined Sexes     Tests
Tests*                 N    Mean    SD      N    Mean    SD      N     Mean    SD   Taken as
 1. ACE               178   94.0   15.4    315   91.7   14.8    493    92.5   15.0   Fr.
 2. Eng.              119  240.3   54.1    185  259.5   49.7    304   252.0   52.3   Fr.
 3. Coop. Math.        93   93.6   16.1    151   86.8   16.2    244    89.4   16.5   Fr.
 4. Coop. Science      98   98.8   19.8    155   83.4   18.4    253    89.3   20.4   Fr.
 5. Coop. Soc. Sci.    91  108.2   18.3    151   96.5   16.8    242   100.9   18.3   Fr.
 6. DAT-Verbal         79   30.6    7.2    116   29.4    7.9    195    29.9    7.6   Fr.
 7. DAT-Numerical      80   26.9    5.2    116   27.0    6.7    196    26.9    6.1   Fr.
 8. DAT-Abstract       79   36.1    6.2    115   34.7    7.2    194    35.3    6.9   Fr.
 9. DAT-Spatial        79   64.4   21.1    116   54.0   18.0    195    58.2   20.0   Fr.
10. DAT-Mechanical     79   47.3   10.4    116   32.9    9.3    195    38.7   12.1   Fr.
11. DAT-Clerical       66   51.9   10.7    110   62.6   10.9    176    58.6   12.0   Fr.
12. ITED               87   23.6    4.2     92   21.7    3.2    179    22.6    3.8   Soph.
13. ITED               41   27.4    4.0     65   25.5    3.5    106    26.2    3.8   Jr.
14. ITED               24   29.4    2.9     38   26.6    4.4     62    27.7    4.1   Sr.
15. ACE               641  131.8   15.7    859  127.5   15.3   1500   129.3   15.6   Jr.
16. Eng.              639  174.3   22.6    859  183.9   19.3   1498   179.8   21.3   Jr.
17. HSR               643   91.8    9.6    868   94.7    6.7   1511    93.5    8.2   Jr.
18. NMST              659   57.7   13.4    884   49.8   11.9   1543    53.1   13.2   Sr.


* Test identification code:

 1. American Council on Education Psychological Examination, 1947 High School Edition
 2. Cooperative English Test, single booklet edition, total score
 3. Cooperative Mathematics Test, Form Y, for Grades 7, 8, and 9
 4. Cooperative Science Test, for Grades 7, 8, and 9
 5. Cooperative Social Science Test, Form Y, for Grades 7, 8, and 9
 6. Differential Aptitude Test (Form A), Verbal Reasoning
 7. Differential Aptitude Test (Form A), Numerical Ability
 8. Differential Aptitude Test (Form A), Abstract Reasoning
 9. Differential Aptitude Test (Form A), Spatial Relations
10. Differential Aptitude Test (Form A), Mechanical Reasoning
11. Differential Aptitude Test (Form A), Clerical Speed and Accuracy
12. Iowa Tests of Educational Development, Composite Standard Score
13. Iowa Tests of Educational Development, Composite Standard Score
14. Iowa Tests of Educational Development, Composite Standard Score
15. American Council on Education Psychological Examination, 1952 College Edition
16. Cooperative English Test (Effectiveness and Mechanics of Expression)
17. High school scholastic record through the eleventh grade
18. National Merit Scholarship (Initial Screening) Test Score.




Table 2


Correlation Coefficients of Scores on State-Wide Program Tests with Scores on
National Merit Scholarship Screening Examination


                            Men               Women          Combined Sexes
                            Correlation       Correlation        Correlation   Tests
Tests*                 N    Coefficient  N    Coefficient   N    Coefficient   Taken as
 1. ACE               178     .682      315     .651       493     .654        Fr.
 2. Eng.              119     .698      185     .660       304     .580        Fr.
 3. Coop. Math.        93     .617      151     .569       244     .609        Fr.
 4. Coop. Science      98     .558      155     .602       253     .633        Fr.
 5. Coop. Soc. Sci.    91     .641      151     O41        242     .675        Fr.
 6. DAT-Verbal         79     .704      116     .678       195     .675        Fr.
 7. DAT-Numerical      80     .350      116     .628       196     .496        Fr.
 8. DAT-Abstract       79     .437      115     .464       194     .457        Fr.
 9. DAT-Spatial        79     .462      116     .262       195     .402        Fr.
10. DAT-Mechanical     79     .501      116     ALS        195     .531        Fr.
11. DAT-Clerical       66     .160      110     .137       176     .103        Fr.
12. ITED               87     .742       92     .107       179     .747        Soph.
13. ITED               41     .822       65     .630       106     .134        Jr.
14. ITED               24     .604       38     .807        62     .782        Sr.
15. ACE               641     .682      859     .667      1500     .678        Jr.
16. Eng.              639     .581      859     .568      1498     .471        Jr.
17. HSR               643     .268      868     .146      1511     .148        Jr.


* See identification code for Table 1.


the other tests require only from 25 to 80 
minutes. The Verbal Reasoning test of the 
Differential Aptitude Test Battery appears to 
do an excellent job of predicting, considering 
that it is a 30-minute test. 

As one would expect, the college edition of 
the ACE Psychological Examination which 
was given during the junior year correlated 
quite substantially with the screening test 
taken a year later. Surprisingly, the Co- 
operative English Test given during the junior 
year did not correlate as well as the ninth- 
grade test, although this may be due to the 
incidental selection factor. Since it is an 
achievement test and the students were 
largely selected on the basis of high school 
performance for eligibility to take the screen- 
ing test, there probably was a restriction in 
range on the English test which could help 
account for the lower correlation. However, 
all test distributions probably are curtailed 
by the selection factor. It is obvious that
high school rank being an explicit selection
factor accounts for the low relationship
between high school rank and score on the
screening test.

This study has demonstrated that there is
a considerable amount of predictive validity
in the nationally available tests used in the
Minnesota State-Wide Testing Program for
predicting standing on the National Merit
Scholarship Screening Test. There appears
to be enough predictive validity so that these
instruments can be used as early as the ninth
grade level, as well as later, to identify
talented students in terms of their later
position on the National Merit Scholarship
Screening Test. These students and their
parents can then be urged to consider higher
education as one alternative in planning
educational and vocational futures.


Received March 31, 1958. 


Reference 


Berdie, R. F. After high school—what?
Minneapolis: Univer. Minnesota Press, 1954.


Journal of Applied Psychology
Vol. 43, No. 1, 1959


SEX AS A DETERMINANT OF DRIVING SKILLS: 
WOMEN DRIVERS! 


LEONARD UHR 


University of Michigan 


The present experiment is an interesting
example of simple, yet controlled,
statistical testing in a complex real-life
situation. Attempts to determine correlates of
driving ability have been relatively
unsuccessful (Conger, Gaskill, Glad, Rainey,
Sanrey, & Turrel, 1957; Granier, 1954;
Lauer, 1955; Miller, 1955). This may be due
to the driver's ability to compensate, in the
usual situation, for his deficiencies.

The advent of the motor scooter (an
unusual, unprepossessing, but fast machine),
with a Michigan law licensing 14-year-olds to
drive vehicles of less than 5 horsepower,
confronted the Michigan auto driver with an
unusual and stressful but (to the auto)
relatively safe situation, in which novel decisions
were required as to speed and size of the
oncoming vehicle. A controlled and “blind”
experimental test of the effects of this
confrontation was set up and run as follows.

An auto was first judged to be behaving
dangerously toward a motor scooter (by cut-


Table 1 


Dangerous and Safe Auto Driving, by Sex


Auto Driver's        Auto Driver's Behavior
Sex                  Dangerous    Safe
6 22 
19 3 
25 25 
0.8 (significant beyond the 0.00001 leve); 


is represents a perfect Contingency 




ting across or into the scooter’s path from a 
stop street or alley so that the scooter driver 
was forced to brake or swerve his vehicle). 
Only after this judgment was made, the sex 
of the auto driver was determined. If sex 
could be ascertained at the time the judgment 
was made, the incident was not counted. 
Twenty-five such incidents were accumulated, 
along with 25 comparison incidents, the first 
matching situation subsequently observed in 
which the auto driver was judged to be acting 
safely toward a scooter (by remaining stopped 
until the scooter passed), and only then iden- 
tified as to sex. 
The results may be seen in Table 1. 
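
The association in Table 1 can be checked with an ordinary chi-square test on the 2 x 2 frequencies. The sketch below is an illustration rather than the author's own computation: it assumes an uncorrected (no continuity correction) chi-square with 1 degree of freedom, and the function name is hypothetical. The rows are the two sex categories in the order given in the table, and the columns are dangerous and safe incidents.

```python
import math


def chi_square_2x2(table):
    """Uncorrected chi-square and upper-tail probability for a 2 x 2 table."""
    (a, b), (c, d) = table
    n = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)
    chi2 = 0.0
    for i, observed_row in enumerate(table):
        for j, observed in enumerate(observed_row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    # With 1 degree of freedom, P(X > chi2) = erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


counts = [(6, 22), (19, 3)]   # (dangerous, safe) for each row of Table 1
chi2, p_value = chi_square_2x2(counts)
print(round(chi2, 1), p_value < 0.00001)
```

Under these assumptions the statistic is about 20.8, which lies beyond the 0.00001 level cited in the Summary.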


Summary 


An auto driver’s behavior was judged either 
dangerous or safe in an unusual, stressful, 
but relatively safe situation. This behavior 
was found to be related to the driver’s sex 


at the 0.00001 level of confidence. 


Received April 16, 1958. 


References 


Conger, J. J., Gaskill, H. S., Glad, D. D., Rainey,
R. V., Sanrey, W. L., & Turrel, E. S. Personal
and interpersonal factors in motor vehicle
accidents. Amer. J. Psychiat., 1957, 113, 1069-1075.

Granier, V. Réflexions sur l'examen des conducteurs
de véhicules ou d'engins de levage. Bull. Cent.
Étud. Rech. psychotech., 1954, 3, 13-21.

Lauer, A. R. Comparison of group paper-and-
pencil tests with certain psychophysical tests for
measuring driver aptitude of army personnel. J.
appl. Psychol., 1955, 39, 318-321.

Miller, Carmen. A comparison of high-accident and
low-accident bus and street car operators. J.
proj. Tech., 1955, 19, 146-151.


Journal of Applied Psychology 
Vol. 43, No. 1, 1959 


USE OF THE GENERAL APTITUDE TEST BATTERY 
TO DETERMINE APTITUDE CHANGES WITH 
AGE AND TO PREDICT JOB 
PERFORMANCE * 


MICHAEL HIRT 
Walter Reed Army Medical Center 


Census data (U. S. Department of Com- 
merce, Bureau of the Census, 1952) since 
the turn of the century have indicated a con- 
sistent increase in the number of older people 
in our population. This growth has been both 
relative and absolute, and 1955 data indicate 
that approximately 10% of our population 
is over 65 years old and one third is over 45 
years old. The consequences of such popula- 
tion changes have serious social, economic, 
and psychological implications. 

Many of the technological changes which 
have increased man’s life span have simul- 
taneously brought about such industrial 
changes as to shorten and jeopardize the span 
of working years permitted to the majority 
of our employed population. In view of 
our society’s implicit philosophy that man’s 
usefulness ceases when he stops working, the 
employment problems encountered by our
increasing aged population are of paramount
concern.

The United States Employment Service 
(USES) is the federal agency whose primary 
responsibility is the procurement of job op- 
portunities for all desiring to work. The 
results of numerous studies (U. S. Depart- 
ment of Labor, Bureau of Employment Secu- 
rity: 1950a, 1950b, 1956; U. S. Department 
of Labor, Bureau of Labor Statistics, 1946) 
conducted by this agency have indicated con- 
clusively that men and women, upon reaching 
their 35th birthday, may expect to encounter 
excessive difficulties in securing employment. 
Although the age at which an applicant is 
considered old varies among companies, the 
vast majority of our working population is 
discriminated against by the time they are 
45 years old. 

1 This study is based upon a dissertation submitted 
to the graduate school at the University of Nebraska, 


in partial fulfillment of the requirements for the 
degree of Doctor of Philosophy. 




In assessing their applicants, the USES 
relies heavily upon the General Aptitude Test 
Battery (GATB). This is a multifactor test 
(Dvorak, 1956) which yields scores in the 
following nine areas: Intelligence (G), Verbal 
Aptitude (V), Numerical Aptitude (N), Spa- 
tial Aptitude (S), Form Perception (P), 
Clerical Perception (Q), Motor Coordination 
(K), Finger Dexterity (F), Manual Dex- 
terity (M). Since the GATB is of such 
consequence in the employment of USES ap- 
plicants, extensive analysis of this instrument 
is justified. Specifically, the purpose of this 
study was to yield data on the following: 


1. Is the relationship between age and
aptitudes in the form of a straight line or a
curve?


2. What is the relationship between these 
aptitudes and job performance? 

3. Which combination of aptitudes and age 
can best explain the variation in job per- 
formance? 


Method 


The sample used in the study was selected from a 
“population” of approximately 1500 Ss. The ages of 
the Ss ranged from 19 to 83 years; their educational 
level ranged from 6 to 14 years. This population 
represented 16 occupations which, in terms of the 
Dictionary of Occupational Titles structure, were 
distributed within the 8 and 9 groups. In other 
words, all these occupations are of an unskilled or 
semi-skilled nature. All the Ss were experienced 
workers and were evaluated with a descriptive rating 
scale developed by the USES in accordance with 
their procedure for developing test batteries for 
employee selection (U. S. Department of Labor, 
Bureau of Employment Security, 1952). 

The population was divided into four age groups: 
25 to 34, 35 to 44, 45 to 54, and 55 and older. 
One hundred Ss were randomly selected from each 
of these age groups, yielding a total sample of 
400 Ss. 

Each of the aptitudes as well as supervisory ratings 
were plotted against age. The test for nonlinearity 
which was used consists of comparing the sum of 
squares for linear regression with the sum of squares 
for quadratic regression (Wert, Neidt, & Ahmann, 
1954). The results of this analysis are summarized 
in Table 1. 

The determination of the most efficient combina- 
tion of aptitudes and age to predict the criterion 
(supervisory ratings) was based on the analysis of 
multiple regression. This multiple regression con- 
sisted only of those aptitudes which, when considered 
singly, correlated significantly with the criterion. 


Results 


The significance of the F test associated 
with the advantage of the quadratic over the 
linear regression indicates whether the 
particular aptitude is related to age in a 
linear or nonlinear manner. The index of 
correlation (R) indicates the correlation 
resulting from the curve, and the zero order 
correlation (rxy) indicates the relationship 
between each aptitude and age. The 
significance of R is indicated by the F test 
associated with the quadratic equation. 

It can be seen from Table 1 that Aptitudes 
G, V, N, and S are related to age in a 
curvilinear manner. The index of correlation 
for each of the aptitudes as well as the 
criterion is statistically significant; all the 
zero order correlations between the aptitudes 
as well as the criterion with age are 
significant and negative, with the exception of 
Aptitudes K and M, which are related in a 
positive and significant manner. 
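
The two F tests described above can be illustrated with a short computation. This is a minimal sketch, assuming the usual F ratios for comparing a quadratic with a linear regression in a sample of N cases; the article does not print its formulas, and the function names are hypothetical. The inputs are the values given for Aptitude G in the first row of Table 1 (R = .266, rxy = -.217) with the sample of 400 Ss.

```python
def f_quadratic(big_r, n):
    """F ratio (df = 2, n - 3) for the quadratic regression as a whole.

    Assumed formula: explained variance per degree of freedom divided by
    residual variance per degree of freedom.
    """
    r2 = big_r ** 2
    return (r2 / 2) / ((1 - r2) / (n - 3))


def f_advantage(big_r, small_r, n):
    """F ratio (df = 1, n - 3) for the gain of the curve over the straight line.

    Assumed formula: the increase from r squared (linear) to R squared
    (quadratic), tested against the residual variance of the quadratic fit.
    """
    return (big_r ** 2 - small_r ** 2) / ((1 - big_r ** 2) / (n - 3))


# Aptitude G: index of correlation, zero order correlation, sample size.
quadratic_f = f_quadratic(0.266, 400)
advantage_f = f_advantage(0.266, -0.217, 400)
print(round(quadratic_f, 2), round(advantage_f, 2))
```

Under these assumptions the two ratios come out near 15.1 and 10.1, close to the 15.106 and 10.035 printed for Aptitude G; differences of this size are what rounding R and rxy to three decimals would produce.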


Table 2 
Changes in Aptitudes G, V, N, and S at Various Ages 
Aptitudes 

Age G V N S 

20 83.848 90.645 82.688 90.216 
25 86.572 91.792 83.940 90.722 
30 88.337 92.299 84.524 90.793 
31 88.575 92.324 84.560 90.756 
32 88.775 92.324 84.570 90.700 
33 88.936 92.297 84.554 90.627 
35 89.144 92.168 84.440 90.429 
37 89.199 91.396 84.219 90.161 
40 88.994 91.396 83.688 89.629 
45 87.885 89.986 82.269 88.394 
50 85.818 87.937 80.182 86.724 
55 82.794 85.248 77.427 84.618 
60 78.812 81.920 74.005 82.077 
70 67.973 73.345 65.157 75.689 


Table 2 indicates the changes which occur 
in Aptitudes G, V, N, and S at various ages. 

It can be seen from Table 2 that these four 
aptitudes reach their peak at ages 37, 31, 
32, and 30, respectively, and then begin to 
decline. 
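
The peak of a fitted quadratic can be located directly from tabled values. The following is a minimal sketch, assuming the entries of Table 2 lie on a parabola (their second differences are nearly constant); it passes a parabola through three equally spaced ages and returns the age at which the curve turns. The function name is hypothetical, and the three G values are copied from Table 2.

```python
def parabola_peak(x_mid, step, y_left, y_mid, y_right):
    """Vertex of the parabola through three equally spaced points.

    The points are (x_mid - step, y_left), (x_mid, y_mid) and
    (x_mid + step, y_right).
    """
    curvature = y_left - 2 * y_mid + y_right  # second difference
    if curvature == 0:
        raise ValueError("points are collinear; no peak exists")
    return x_mid + step * (y_left - y_right) / (2 * curvature)


# Aptitude G at ages 30, 35, and 40 (Table 2).
g_peak = parabola_peak(35, 5, 88.337, 89.144, 88.994)
print(round(g_peak, 1))
```

For Aptitude G the turning point falls between ages 36 and 37, which agrees with the tabled maximum at age 37.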

When the zero order correlations between 
the criterion and the aptitudes as well as age 
were computed, it was found that Aptitudes 
P, Q, K, M, and age correlated at the .01 
level with the criterion. Table 3 indicates 


Table 1 

Advantage of the Quadratic over Linear Regression 
for the General Aptitude Test Battery 
and the Criterion 

                                   F Value 
Aptitudes     R      rxy      Quadratic   Advantage 
G           .266   -.217**    15.106**    10.035** 
V           .266   -.245**    15.164**     4.669* 
N           .232     …        11.298       3.664* 
S           .592   -.560**   107.358**    27.799** 
P           .487   -.487**    61.791**      … 
Q            …       …          …           … 
K           .266     …          …          .761 
F           .370   -.367**    31.577**      … 
M           .309    .303**    20.962**      … 
Criterion   .191   -.189*     11.373**      … 

* Significant at the .05 level. 
** Significant at the .01 level. 


Table 3 

Analysis of Multiple Regression of the Criterion 

Source of Variation                      R       F 
Five variable regression               .337    8.970* 
  (Aptitudes P, Q, K, M, and age) 
Four variable regression               .320     .740 
  (Aptitudes Q, K, M, and age) 
Three variable regression                …       … 
  (Aptitudes K, M, and age) 
Two variable regression                .250   16.141* 
  (Aptitude M and age) 
Two variable regression                  …       … 
  (Aptitude K and age) 

* Significant at the .01 level. 


the results of the analysis of multiple regres- 
sion between these five variables and the 
criterion. 

It can be seen from Table 3 that only the 
elimination of one of the variables, Aptitude 
K, results in a significantly lower multiple 
correlation. In other words, this is the only 
variable which contributes significantly in the 
prediction of the criterion. 


Discussion 


Several significant points emerge from the 
results of this study. Outstanding among 
these is the curvilinear relationship between 
age and Aptitudes G, V, N, and S. The con- 
sequences of such a relationship are not ex- 
plicitly clear. In general, when a given pre- 
dictor (age in this particular case) correlates 
negatively with the other valid predictors and 
zero with the criterion, that variable con- 
tributes unwanted variance and thereby de- 
tracts from the efficiency of the prediction 
battery. 

In this particular case, since only Aptitude 
K contributes significantly to explaining the 
variance in the criterion, and this is one of 
the two variables which is not correlated with 
age in a negative manner, age does not seem 
to penalize the work evaluation scores (cri- 
terion) of this particular sample. 

There are at least two considerations which 
are necessary in interpreting the results of 
this study. In the first place, the criterion 
may be inadequate. This is a risk inherent 
in practically all applied research. In the 
second place, and this seems more reasonable, 
the results may reflect the sample used. It 
will be recalled that the occupations repre- 
sented were all in the unskilled level; it is 
quite likely that for such occupations motor 
coordination is indeed the major or only apti- 
tude measured by the GATB which is needed 
for successful job performance. This possi- 
bility should be considered in interpreting the 
results from the analysis of multiple regres- 
sion. With regard to the curvilinear relation- 
ship between age and four of the aptitudes 
measured by the GATB, this finding alone 
justifies considering age as a relevant variable 


when using the GATB to predict job per- 
formance. 


Michael Hirt 


If, in repetitions of this study with samples 
drawn from different occupational groups and 
with the development of more adequate cri- 
teria, it can be demonstrated that age cor- 
relates zero with the criterion and negatively 
with the valid predictors, a correction factor 
will have to be introduced to compensate for 
age detriments. 


Summary 


This study attempted to answer the fol- 
lowing questions: 


1. Is the relationship between age and ap- 
titudes as measured by the GATB in the form 
of a straight line or a curve? 

2. What is the relationship between these 
aptitudes and job performance? 

3. Which combination of aptitudes and age 
can best predict the variance in the criterion? 


The sample used consisted of 400 Ss 
equally divided into age groups ranging from 
25 to 34, 35 to 44, 45 to 54, and 55 and older. 
The GATB scores as well as job performance 
evaluations were related to age by means of 
nonlinear regression. It was found that Apti- 
tudes G, V, N, and S were related to age 
in a curvilinear manner, reaching their peak 
at ages 37, 31, 32 and 30, respectively, and 
then beginning to decline. 

When the best prediction scheme of the 
criterion was sought, it was found that only 
Aptitude K contributed significantly to pre- 
dicting the variance in the criterion. The 
possibility was suggested that this may be 
an artifact of the sample used. 


Received April 25, 1958. 


References 


Dvorak, B. J. The General Aptitude Test Battery. 
Personn. Guid. J., 1956, 3, 145-154. 

U. S. Department of Commerce, Bureau of the 
Census. A report of the seventeenth decennial 
census of the United States: Census of population: 
1950. Washington: U. S. Government Printing 
Office, 1952. 

U. S. Department of Labor, Bureau of Employment 
Security. Older workers at the public employment 
office. Washington: U. S. Department of Labor, 
1950. (a) 

U. S. Department of Labor, Bureau of Employment 
Security. The older worker in the labor market. 
Washington: U. S. Government Printing Office, 
1950. (b) 

U. S. Department of Labor, Bureau of Employment 
Security. Test development guide. Washington: 
U. S. Department of Labor, 1952. 

U. S. Department of Labor, Bureau of Employment 
Security. Older worker adjustment to labor mar- 
ket practices: An analysis of experiences in seven 
major labor markets. Washington: U. S. Govern- 
ment Printing Office, 1956. 

U. S. Department of Labor, Bureau of Labor Statis- 
tics. Census of discrimination against older work- 
ers. Monthly Labor Rev., 1946. 

Wert, J. E., Neidt, C. O., & Ahmann, J. S. Statis- 
tical methods in educational and psychological 
research. New York: Appleton-Century-Crofts, 
1954. 


Journal of Applied Psychology 
Vol. 43, No. 1, 1959 


PREDICTION OF AN ULTIMATE CRITERION 
OF SUCCESS AS A LAWYER 1 


M. K. DISTEFANO, JR. AND BERNARD M. BASS 


Louisiana State University 


The Law School Admission Test (LSAT), 
constructed and administered by Educational 
Testing Service, is designed to predict scholas- 
tic achievement in law school. Performance 
on this test is said to depend upon the “ability 
to read, to understand, and to reason logically 
with a variety of verbal, quantitative, and 
symbolic materials” (Educational Testing 
Service, 1954a, p. 3). The test-retest re- 
liabilities of the LSAT (short form) range 
from .67 to .87, with a median r of .77 (John- 
son, Olsen, & Winterbottom, 1955). 

The need for a predictor other than prelaw 
grades or in addition to grades is evidenced 
by the inconsistency of grading practices in 
different colleges and departments (Bass, 
1951). The Educational Testing Service in- 
vestigated the effectiveness of the LSAT in 
combination with prelaw grades in predicting 
academic success (Johnson, 1954; Johnson et 
al., 1955; Educational Testing Service, Va- 
lidity Studies Section, 1954). In these stud- 
ies both the long and short forms were used. 
The latter was found to predict as well as the 
former, if not better. In general the findings 
showed the LSAT to be a better predictor 
than prelaw grades, but that both combined 
were better predictors than either considered 
individually. Application of these findings 
has resulted in an appreciable reduction in the 
percentage of law school failures (Johnson 
et al., 1955). 


and .25 with low grades and 
. Multiple correlations with the 
immediate and intermediate criteria were .49 and .41, 
respectively. Results were thus consistent with the 
findings of ETS. 




The present study attempted to predict an 
ultimate criterion of success as a lawyer with 
the LSAT and prelaw grades. 


Ultimate Criterion of Success 
as a Lawyer 


The ultimate criterion was the demon- 
strated legal ability of examinees after five 
years out of law school as rated by court 
judges living in the area in which the lawyers 
practiced. 

Preliminary interviews with lawyers, both 
academic and nonacademic, agreed with the 
ETS conclusion that the ultimate criteria of 
success as a lawyer were fuzzy. Yet, some 
degree of consensus was found in the evalua- 
tion of specific individuals by judges. The 
sample of 17 practicing lawyers came from 
two urban areas in the State; they had been 
graduated from the LSU law school in 1948 
through 1950. They had been administered 
the LSAT on entrance to law school. In the 
first area, three district judges were asked to 
rate 10 lawyers in their district on the basis 
of their impressions of the lawyer's legal 
ability. They were instructed to rate them 
on a five-point continuum, with a rating of 
1 being “very low in legal ability” and 5 
being “very high in legal ability.” They were 
further told to specify if a lawyer was “not 
known” to them. (This eliminated one S.) 

In the second area, seven practicing lawyers 
were rated by two district judges. The agree- 
ment among the three judges in the first 
urban group yielded interrater reliabilities of 
.82, .82, and .97. However, the correlation 
between the two raters in the second urban 
area was only .42. 


Results and Conclusions 


The bimodality and nonsymmetry of the 
criterion ratings suggested the use of a non- 
parametric analysis. The first urban sample 
appeared to separate naturally into five 
“highs” in rated legal ability and four “lows.” 


The second bimodal sample separated natu- 
rally into five “highs” and two “lows.” These 
16 lawyers were ranked on the LSAT scores 
they had earned upon admission to law 
school from first to sixteenth. The 10 “highs” 
according to judges’ ratings had a mean rank 
of 7.8 on LSAT scores. The 6 “lows” had a 
mean rank of 11.6. According to White’s 
test (Edwards, 1954), the difference was sig- 
nificant at the 5% level (T' = 35). Corre- 
sponding mean ranks of the rated “highs” 
and “lows” on prelaw grades were 6.9 and 
13.4. Again, the difference was significant 
at the 5% level (T' = 44). A ranking based 
on an optimum weighting of LSAT scores and 
prelaw grades (optimum in forecasting law 
school grades) yielded a mean rank of 7.2 
for the “highs” and 12.8 for the “lows” with 
a T' of 38 significant at the 5% level. 
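
The rank comparison described above can be sketched in a few lines. This is an illustration only: the scores are hypothetical (the article does not list the individual LSAT scores), the function name is invented, and the conjugate total T' = n(N + 1) - T for the smaller group of size n in a pooled sample of N is an assumption about how White's rank-sum statistic is usually tabled, not something the article states.

```python
def rank_sums(group_a, group_b):
    """Rank all scores together (rank 1 = highest score) and sum ranks per group.

    Tied scores share the mean of the ranks they occupy.
    """
    pooled = sorted(group_a + group_b, reverse=True)

    def mean_rank(value):
        first = pooled.index(value) + 1   # best rank held by this value
        count = pooled.count(value)       # number of tied scores
        return first + (count - 1) / 2

    return (sum(mean_rank(v) for v in group_a),
            sum(mean_rank(v) for v in group_b))


# Hypothetical test scores for 10 "highs" and 6 "lows".
highs = [640, 610, 590, 585, 570, 560, 540, 530, 505, 480]
lows = [575, 520, 500, 470, 455, 430]

t_high, t_low = rank_sums(highs, lows)
n_small = len(lows)
n_total = len(highs) + len(lows)
t_conjugate = n_small * (n_total + 1) - t_low   # assumed definition of T'

print(t_high / len(highs), t_low / len(lows), t_conjugate)
```

A useful check on any such tabulation is that the two rank sums for 16 cases must add to 16 x 17 / 2 = 136; the smaller of T and T' is then referred to the table of critical values.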


Summary 


District court judges’ ratings of 16 lawyers 
practicing in their areas for five years pro- 
duced two ultimate criterion groups. The 
highly judged lawyers were found to have 
scored significantly higher on the Law School 
Admission Test taken upon entrance to law 
school. Their prelaw grades were also found 
to have been significantly higher. 

These results were obtained despite the fact 
that the test was constructed without refer- 
ence to postschool law success; and despite 
the ambiguities in evaluating ultimate suc- 
cess as a lawyer, and the restrictions in size 
and range of the sample due to attrition in 
law school and change of career following 
graduation. 


Received April 24, 1958. 
References 


Bass, B. M. Intra-university variations in grading 
practices. J. educ. Psychol., 1951, 42, 336-338. 

Educational Testing Service. Law School Admission 
Test. Statistical summary by undergraduate col- 
leges attended 1948-1952. Princeton: Author, 
1952. 

Educational Testing Service. Law School Admission 
Test. Bulletin of information 1954-55. Princeton: 
Author, 1954. (a) 

Educational Testing Service. Law School Admission 
Test. Statistical summary by undergraduate col- 
leges attended 1948-1954. Princeton: Author, 
1954. (b) 

Educational Testing Service, Validity Studies Sec- 
tion. The Law School Admission Test as a pre- 
dictor of law school grades. 1952-53. Princeton 
and Los Angeles: Author, 1954. 

Edwards, A. L. Statistical methods for the behav- 
ioral sciences. New York: Rinehart, 1954. 

Johnson, A. P. Effectiveness of the Law School Ad- 
mission Test in predicting scholastic success in law 
school. Princeton: Educational Testing Service, 
1954. 

Johnson, A. P., Olsen, Marjorie A., & Winterbottom, 
J. A. The Law School Admission Test and sug- 
gestions for its use. Princeton: Educational Test- 
ing Service, 1955. 


Journal of Applied Psychology 
Vol. 43, No. 1, 1959 


SELECTION OF SUBPROFESSIONAL HOSPITAL 
CARE PERSONNEL 1 


NORMAN CLIFF, SIDNEY H. NEWMAN, AND MARGARET A. HOWELL 


U. S. Public Health Service, Washington, D. C. 


According to Hospitals (Anonymous, 1956) 
there are over 400,000 subprofessional hos- 
pital care employees in the United States. 
Groups of applicants for these subprofessional 
positions tend to contain a large proportion 
of individuals of low caliber. Consequently, 
some means is needed whereby applicants 
who are suitable for this type of work can 
be selected. 

Only a few studies of methods for selecting 
and evaluating these necessary personnel are 
reported in the literature; for the most part, 
those that are reported concern the selection 
of psychiatric aides. The evidence presented 
by Levine (1951), Barron and Donohue 
(1951), and Yerbury, Holzburg, and Allessi 
(1951) indicates that aptitude or ability 
tests may contribute in some degree to the 
selection of subprofessional hospital care 
personnel. 

Studies using personality measures for se- 
lection report varying degrees of success in 
predicting job performance. Kline (1950), 
Yerbury, Holzburg, and Allessi (1951), and 
Love (1955) report at least some degree of 
success using personality measures of various 
types while Levine (1951) and Caudra and 
Reed (1957) report negative findings. 

It is clear that studies are needed which 
will furnish more definitive evidence on the 
validity of aptitude or ability tests and per- 
sonality tests in the selection of subprofes- 
sional hospital care personnel, especially kinds 
of personnel other than psychiatric aides. 


Subjects 


The subject group consisted of 150 incum- 
bents at two U. S. Public Health Service 
hospitals. The distribution of the group by 
grade and specialty is given in Table 1. 


1 The authors wish to express their appreciation 
to Milton Epstein, formerly with the Public Health 
Service, now at Davis Memorial Goodwill Indus- 
tries, for assistance in the preparation of the tests 
and collection of data. 




Nursing Assistants at Grades 1 and 2 are 
mainly concerned with sanitary and custodial 
care of patients while those at Grades 3 and 4 
are more nearly like, and often actually are, 
registered practical nurses. Employees in the 
Medicine and Surgery specialty are concerned 
with care of patients in medical and surgical 
wards; those in the Psychiatric group are 
psychiatric attendants; those in the Operating 
Room specialty perform duties involving 
preparation of patients for surgery, care of 
operating room equipment, and scrub nurse 
functions. 


Method 


Tests and administration. The nine tests 
whose names are given in Table 2 were se- 
lected by means of Primoff’s J Scale tech- 
nique and special forms of the tests were 
developed (Primoff, 1957).2 The administra- 
tion of the tests, which was carried out during 
regular working hours, was found to be diffi- 
cult due to the low ability level of the Ss and 
their unfamiliarity with objective tests. A 
number of zero scores were found on all of 
the tests. 

Criterion measure. The criterion used was 
a set of 18 12-point scales specifically de- 
signed to measure the important aspects of 
performance of the Nursing Assistant. These 
scales are listed in Table 2. Each scale has 
four behaviorally defined levels of perform- 
ance with three points within each level. The 
criterion score of an individual on each scale 
was the average of the confidential ratings 
given him by five professional nurses. 


2 The work of Ernest Primoff and other members 
of the Test Development Section, U. S. Civil Service 
Commission, in the selection and development of the 
tests is gratefully acknowledged. A Federal inter- 
agency committee cooperated in the test develop- 
ment project. 

3 We wish to acknowledge the aid of Senior Nurse 
Officer Mary Jenney, who collaborated in the devel- 
opment of the tests, did the professional nurse work 
essential to the development of the rating scales, and 
made other contributions to the study. 




Table 1 

Distribution of Nursing Assistants by 
Specialty and Grade 

                           GS Grade Level 
Specialty                1     2     3     4    Total 
Medicine and Surgery    24    71    17     0     112 
Psychiatric              0    13     7     6      26 
Operating Room           0     4     8     0      12 
Total                   24    88    32     6     150 


Statistical analyses. Three statistical analy- 
ses were made. In one, all intercorrelations 
of scores for the total group on the 27 vari- 
ables were computed and analyzed by Gutt- 
man’s multiple group method of factoring, 
all factors being extracted at once and or- 
thogonalized (Guttman, 1952). Minor or- 
thogonal rotations were made to clarify the 
structure. In addition, less extensive analy- 
ses were made of the scores of two more 
homogeneous groups. For one subgroup, con- 
sisting of 95 Ss at the GS-1 or 2 level in the 
Medicine and Surgery specialty, test validi- 
ties in predicting Scales 1 and 8 (see Table 
2) and the intercorrelations of the tests were 
computed. For the other subgroup, consist- 
ing of 26 Psychiatric Nursing Assistants, only 
Scale 1 validities were computed. 


Results and Interpretation 


Intercorrelation matrix. From the com- 
plete matrix of intercorrelations for the total 


Table 2 

Factor Loadings: Criterion Ratings and Selection Tests 

                                                             I       II      III 
Criterion Ratings 
 1 Dexterity and Adaptability in Handling Equipment          …       …       … 
 2 Awareness of Patients’ Needs                              …       …       … 
 3 Recognizing Safety Hazards and Adherence to 
     Aseptic Technique                                       …       …       … 
 4 Accuracy in Carrying out Procedures                     .345   -.056    .406 
 5 Accuracy in Making and Reporting Observations             …       …       … 
 6 Organization of Work                                      …       …       … 
 7 Checking of Inventories                                 .725    .075    .442 
 8 General Fitness as a Nursing Assistant                  .985    .029    .343 
 9 Emotional Control                                       .804    .019    .294 
10 Accepting Changes in Assignment                         .819    .131    .035 
11 Acceptance of Criticism                                 .909    .082    .056 
12 Relationships with Co-workers                           .854    .036    .035 
13 Discretion Exercised When Speaking                      .837    .073    .078 
14 Dependability in Carrying out Assignments               .881   -.024    .038 
15 Standards of Cleanliness and Order                        …       …       … 
16 Adherence to Hospital Policies and Regulations          .837    .042   -.041 
17 Attendance and Promptness                               .683      …       … 
18 Appearance                                              .706    .082   -.062 

Tests 
19 Gross Dexterity                                         .045    .597    .397 
20 Background for Handling Nursing Equipment a             .011    .576    .361 
21 Nursing Perception                                      .070    .676    .436 
22 Number Checking                                        -.003    .740    .465 
23 Coding Memory                                           .009    .706    .419 
24 Verbal Ability                                         -.007    .686    .449 
25 Fine Dexterity                                         -.023    .595    .356 
26 Arithmetical Ability                                    .000      …     .326 
27 Ability to Follow Oral Directions                       .044    .669    .435 

a A test of mechanical comprehension using hospital equipment in the items. 


group it was seen that the degree of relation- 
ship between test scores and criterion scales 
was rather small, but that some of the scales 
correlated more highly with the tests than 
others.4 All the scales intercorrelated more 
highly among themselves than they inter- 
correlated with the tests, and all the tests 
intercorrelated highly. 

Factor analysis. As can be seen from the 
results of the factor analysis in Table 2, 
there are two clearly defined factors, per- 
formance ratings (Factor I) and ability tests 
(Factor II), and a fairly definite third, the 
factor common to both the ratings and tests 
(Factor III). All the tests have moderate 
and approximately equal loadings on Factor 
III, as do about half of the criterion scales, 
while the other half of the scales have zero 
or negative loadings. 

The relatively high intercorrelations among 
the tests and their high loadings on Factor 
II, the ability test factor, indicate that the 
tests are not measuring in any great degree 
the specific abilities they were designed to 
measure, but are rather measuring a general 
ability for the most part. 

The differentiation between the predictable 
and unpredictable criterion scales does not 
make for a completely clear-cut interpreta- 
tion of Factor III, but the scales with high 
loadings seem to be those which deal with 
the Nursing Assistant’s ability to perform 
tasks while the low loadings are those of 
scales relating to aspects of the job more com- 
pletely dependent on personality and social 
factors. The highest loading on Factor III 
is that of Scale 1, Dexterity and Adaptability. 
The other scales with loadings over .40 on 
this factor, Scales 2-7, also have to do pri- 
marily with task performance aspects of the 
job. On the other hand, Scales 10-18, such 
as Acceptance of Criticism, Relationships with 
Co-Workers, and Adherence to Hospital Poli- 
cies and Regulations, which reflect person- 
ality, work attitudes, and social behavior, 
have zero or negative loadings. General Fit- 
ness, Scale 8, has a positive loading, but is 
lower than Scales 1-7. The only important 
contradiction to the differentiation of scales 
is provided by Scale 9, Emotional Control, 
which has a moderate positive loading of .29. 

4 The complete matrix of intercorrelations for the 
total group has been deposited as Document number 
5807 with the ADI Auxiliary Publications Project, 
Photoduplication Service, Library of Congress, 
Washington 25, D. C. Copies may be secured by 
citing the Document number and by remitting $1.25 
for photoprints, or $1.25 for 35 mm. microfilm. 
Advance payment is required. Make checks or 
money orders payable to: Chief, Photoduplication 
Service, Library of Congress. 

The differential predictability of the rating 
scales is somewhat surprising in view of the 
high correlation observed among all the scales. 
This differential predictability indicates that 
the degree of validity evidenced by predictors 
of the ability type depends to a considerable 
extent on the degree to which the criterion is 
composed of task performance aspects of the 
job. Presumably, success of prediction would 
be greatly enhanced if appropriate measures 
of the relevant personality variables, as re- 
flected in the scales having loadings around 
zero on Factor III, could be devised. 

Validity coefficients. More specific infor- 
mation on the test validities is furnished in 
Table 3. This table gives the validity coeffi- 
cients based on Dexterity and Adaptability, 
Scale 1, the scale which had the highest load- 
ing on Factor III, and on General Fitness, 
Scale 8, for both the total group and the 
Medicine and Surgery subgroup. Only va- 
lidities based on Scale 1 are given for the 
small Psychiatric subgroup. 

Validities for the total group in predicting 
over-all General Fitness, Scale 8, range from 
.11 to .20; the four highest of these coeffi- 
cients are significant at the .05 level. Va- 
lidities for Dexterity and Adaptability, Scale 
1, are somewhat higher, ranging from .19 to 
.30; two of these are significant at the .05 
level, and the rest, at the .01 level. The rela- 
tive predictability of these two scales is con- 
sistent with the results of the factor analysis. 
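
The significance levels attached to these validities can be checked with the usual t transformation of a correlation coefficient. The following is a minimal sketch; the function name is hypothetical, and the assumption that the authors used this test (and a two-tailed criterion) is not confirmed by the article. With N = 150 the two-tailed critical values of t are roughly 1.98 at the .05 level and 2.61 at the .01 level.

```python
import math


def t_from_r(r, n):
    """t statistic (df = n - 2) for testing a correlation against zero."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)


# Lowest and highest validities mentioned for the total group (N = 150).
for r in (0.11, 0.20, 0.30):
    print(r, round(t_from_r(r, 150), 2))
```

On these assumptions a validity of .20 exceeds the .05 criterion but not the .01 criterion, a validity of .30 exceeds both, and a validity of .11 reaches neither, which is in line with the pattern of significance reported for the two scales.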

The validities of the Medicine and Surgery 
subgroup are about the same as those for the 
total group, which is not surprising since this 
category constitutes about 60% of the total. 
In this subgroup as well as in the total group, 
multiple correlational analysis indicated that 
the increase in prediction from the use of 
more than one predictor was of neither prac- 
tical nor statistical significance. 

The test validities in the small Psychiatric 
subgroup ranged from .10 for Arithmetical 
Ability to .45 for Gross Dexterity. The 
high validity for the Gross Dexterity 
test in this group is due primarily to the fact 
that Ss who received zero scores on this test, 
nine out of 26, tended to have low ratings. 
Since the zero scores are almost sure to be 
due to a failure to understand or follow direc- 
tions rather than a complete lack of manual 
dexterity, the test of Ability to Follow Oral 
Directions probably best represents the abil- 
ity which is related to job performance. Also, 
inspection of the distribution of scores for all 
the tests led to the conclusion that the test of 
Ability to Follow Oral Directions was of the 
most appropriate difficulty level for the Psy- 
chiatric as well as the other groups of Nurs- 
ing Assistants. 


Table 3 

Validity Coefficients by Subgroup and Performance Rating Scale 

                                     Psychiatric   Medicine and Surgery 
                                     Subgroup      Subgroup                Total 
                                     Scale 1       Scale 8    Scale 1      Scale 8    Scale 1 
Test                                 N = 26        N = 95     N = 95       N = 150    N = 150 
Gross Dexterity a                     .45            …         .20*         .19*        … 
Background for Handling Nursing 
  Equipment                           .26           .12        .18          .14        .20* 
Nursing Perception                     …            .04         …           .11        .26** 
Number Checking                       .37           .18        .27           …         .28** 
Coding Memory                         .38           .13        .29**         …         .24** 
Verbal Ability                        .14           .12        .22*         .14         … 
Fine Dexterity a                      .13           .08        .24*         .12         … 
Arithmetical Ability                  .10           .13        .24*         .14        .19* 
Ability to Follow Oral Directions     .35            …          …           .20*        … 

a Scatter plots indicated that the correlations for the Gross and Fine Dexterity 
tests would have been about the same as shown if the zero scores, of which there 
were an undue number, were eliminated; the validities of the dexterity tests, however, 
may not reflect the degree to which manual dexterity is involved in the job. 
* p < .05 
** p < .01 

5 The apparent inconsistency in the significance of 
the validities for the Psychiatric subgroup in 
Table 3 (where the correlation of .35 for Oral 
Directions is significant, but .38 for Coding Memory 
is not) is due to the fact that the significance of the 
degree of association was established from a 
median-split contingency table using Fisher’s Exact 
Test. 


Practical usefulness of selection tests. 
The practical usefulness of the Oral Direc- 
tions test, which is deemed the most appro- 
priate for Nursing Assistants, can be illus- 
trated by dividing the Medicine and Surgery 
group into quintiles and the Psychiatric group 
at the median on performance as reflected by 
Scale 1 and comparing the test score dis- 
tributions of the upper and lower performing 
groups. In the Medicine and Surgery sub- 
group, 74% of the lowest fifth on the test are 
low performers while only 26% are high per- 
formers. These figures are nearly reversed 
for the highest fifth on the test; 28% of this 
group are low performers while 72% are high 
performers. For the Psychiatric subgroup, 
69% of those above the median on the test 
are high performers and 31% are low, while 
the group below the median on this test is 
composed of 23% high performers and 77% 
low performers. Thus it appears that the av- 
erage level of performance could be signifi- 
cantly raised through the use of this test. 
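
The kind of breakdown reported in this paragraph can be produced with a small tabulation. This is a minimal sketch on hypothetical data: the scores and ratings below are invented for illustration, the function name is not from the article, and "high performer" is taken to mean a rating above the group median, which is one reading of the procedure described.

```python
from statistics import median


def high_performer_share(test_scores, ratings, n_groups=5):
    """Percentage of high performers within each test-score group.

    People are ordered by test score and cut into n_groups groups of
    (nearly) equal size, lowest scorers first. A high performer is
    anyone whose rating is above the median rating.
    """
    cutoff = median(ratings)
    order = sorted(range(len(test_scores)), key=lambda i: test_scores[i])
    shares = []
    for g in range(n_groups):
        start = g * len(order) // n_groups
        stop = (g + 1) * len(order) // n_groups
        members = order[start:stop]
        highs = sum(1 for i in members if ratings[i] > cutoff)
        shares.append(100.0 * highs / len(members))
    return shares


# Hypothetical test scores and mean supervisory ratings for ten people.
scores = [3, 5, 8, 10, 12, 15, 17, 20, 22, 25]
ratings = [4.0, 6.5, 5.0, 7.0, 6.0, 8.5, 7.5, 9.0, 8.0, 10.0]

print(high_performer_share(scores, ratings))
```

Setting n_groups to 2 gives the median split used for the Psychiatric subgroup; a rising share of high performers from the lowest to the highest group is what a useful selection test should show.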


Summary and Conclusions 


Nine ability tests developed by the United 
States Civil Service Commission as a coopera- 
tive project seem to predict the work perform- 
ance of Nursing Assistants in the U. S. Public 
Health Service to some degree. The tests are 
of use in screening out applicants of too low 
an ability level to adapt successfully to the 
task performance aspects of the work. The 
validity of the predictors appears to be pri- 


46 


marily due to a general ability rather than 
abilities specific to individual tests. The test 
of Ability to Follow Oral Directions is per- 
haps most appropriate because of the satis- 
factory distribution of scores it gives with a 
group of this ability level and because it prob- 
ably measures best the ability which appears 
to be producing the validity. 

The 18 criterion scales measure a general 
performance factor, Factor I. In addition 
about half of the scales, those measuring the 
task performance aspects of the work, com- 
bine with the tests to constitute Factor III. 
The ability to perform the tasks involved in 
the work, then, is predictable to some degree. 
The aspects of the work represented in those 
scales with low or negative loadings on Fac- 
tor III, which appear to deal with person- 
ality, social, and work attitude characteristics, 
are not predictable by the tests employed in 
the study. To measure those characteristics, 
other tests would have to be constructed. 


Received April 28, 1958. 


Norman Cliff, Sidney H. Newman, and Margaret A. Howell 


References 


Anonymous. Nursing personnel employed in hos-
pitals. Hospitals, 1956, 30, 66-71.

Barron, E. M., & Donohue, H. H. Psychiatric aide
selection through psychological examination. Amer.
J. Psychiat., 1951, 107, 859-865.

Caudra, C. A., & Reed, C. F. Prediction of psychi-
atric aide performance. J. appl. Psychol., 1957,
41, 195-197.

Guttman, L. Multiple group methods for common-
factor analysis: Their basis, computation, and in-
terpretation. Psychometrika, 1952, 17, 209-222.

Kline, N. S. Characteristics and screening of unsatis-
factory psychiatric attendant-applicants. Amer. J.
Psychiat., 1950, 106, 573-586.

Levine, S. The relationship between personality and
efficiency in various hospital occupations. Unpub-
lished doctoral dissertation, New York Univer.,
1951.

Love, J. O. Educational background and job ad-
justment of private hospital psychiatric aides.
Amer. J. Psychiat., 1955, 112, 186-189.

Primoff, E. S. The J-Coefficient approach to jobs
and tests. Personnel Admin., 1957, 20, 34-40.

Yerbury, E. C., Holzburg, J. D., & Allessi, S. Psy-
chological tests in the selection and placement of
psychiatric aides. Amer. J. Psychiat., 1951, 108,
91-97.


Journal of Applied Psychology 


Vol. 43, No. 1, 1959 


CONTRAST AND CONVERGENCE EFFECTS IN 
RATINGS OF FOODS 1, 2


JOE KAMENETZKY 


Quartermaster Food and Container Institute for the Armed Forces


A major requirement in establishing the ac- 
ceptability of foods for military use is that of 
taste-testing. The major purposes of these 
tests are to evaluate samples submitted by 
food processors intending to bid on procure- 
ment contracts, to determine the effects on 
preference of certain processing variables, and
to assess the degrees of liking for new foods. 

In connection with this service function, a 
considerable amount of criterion and meth- 
odological research on affective evaluations 
has been in progress for several years. For 
the most part, the rationale for the specific 
problems investigated has been based on em-
pirical and practical considerations, and the
results have proved useful in improving the
reliability and interpretability of taste-test
data. However, it was felt that increasing
emphasis on theory will lead to greater inte- 
gration of findings and will facilitate applica- 
tions of methodological research. 

A simple model was developed as a starting 
point. The research strategy was to set forth 
tentative and perhaps over-simplified assump- 
tions based largely on observations of the test- 
ing process and subsequent results, to derive 
and test hypotheses, and to revise the model 
accordingly. The broader implications of the 
assumptions and results will be discussed 
later. 


1 This paper reports research undertaken at the
Quartermaster Food and Container Institute for the
Armed Forces, and has been assigned Number [illegible]
in the series of papers approved for publication. The
views or conclusions contained in this report are
those of the author. They are not to be construed
as necessarily reflecting the views or endorsement of
the Department of Defense.

2 The writer expresses his appreciation to Norman
Gutman, Chief, Statistics Branch, and Bettye John-
son, Statistical Clerk, for their valuable assistance.
Special acknowledgment is made to Phyllis Whitmer
and Audrey Beauvais, Home Economists, Food Ac-
ceptance Branch, who performed almost all the labo-
ratory work.


Assumptions

1. An individual evaluating a given food
item in terms of like and dislike bases his
judgment on the presence or absence of sev-
eral characteristics of the food.

2. Characteristics of foods are of two types: 
negative and positive. For any food, charac- 
teristics of both types are likely to be present. 

3. (a) Noticing the absence of a positive 
characteristic will result in a lower preference 
rating for the food. 

(b) Noticing the presence of a negative
characteristic will result in a lower preference
rating for the food.

4. (a) When the positive characteristics of 
a good quality food (a “good”) predominate, 
the presence of some of the negative charac- 
teristics is not noticed or taken into consid- 
eration. 

(b) When the negative characteristics of 
a poor quality food (a “poor”) predominate, 
the presence of some of the positive charac- 
teristics is not noticed or taken into considera- 
tion. 

5. (a) Presentation of a “poor” increases 
an individual’s awareness of the presence of 
some of the same negative characteristics in 
a “good.” 

(b) Presentation of a “good” increases
an individual’s awareness of the absence of
some of the same positive characteristics in
a “poor.”

6. (a) As successive samples of a “poor” 
are served, forgetting the absence of some 
positive characteristics takes place. 

(b) As successive samples of a “good” 
are served, forgetting the presence of some 
negative characteristics takes place. 


Experimental Implications


It can be shown that the above assump- 
tions lead to the following predictions. 

1. A “poor” will be rated lower when pre-
ceded by a “good” than when it is preceded
by another “poor.” Consequently, the dif-
ference in mean preferences between a “good”
and “poor” should be larger when the ratings
of the “poor” are obtained after a “good”
than after another “poor.” This effect is
called contrast.

2. A “good” will be rated lower when pre-
ceded by a “poor” than when it is preceded
by another “good.” Thus, the difference in
mean preferences between a “good” and
“poor” should be smaller when the ratings of
the “good” are obtained after a “poor” than
after another “good.” This effect is called
convergence.

3. Preference will increase with successive
servings of the same quality, provided no op-
posite quality intervenes.


Method


Four independent replications of the experi-
ment were conducted on four separate days
between August, 1957, and January, 1958.


Foods


The food tested in the first two replications was a
cherry beverage made from a concentrated liquid
base. The food tested in the second two replications
was a beef broth prepared from a granulated base.
On the basis of results from previous taste tests,
“good quality” lots of each food were selected.
“Poor quality” samples of each food were prepared
as follows: For the cherry beverage on the first rep-
lication, 44 ml. of vinegar and .55 g. of caffeine were
added to the standard ingredients of 570 g. of sugar,
4500 ml. of water, and 95 ml. of beverage base. On
the second replication, adulterating ingredients were
13.4 g. “liquid smoke” and 10.8 ml. vinegar. For
the beef broth on both replications, the powder was
partially burned, thereby producing a definite acrid
taste. Since the burning could not be precisely con-
trolled, the tastes of the “poors” on the two repli-
cations were not identical. The holding and serving
temperature of the beverage was 50° F., and of the
broth, 100° F.


Table 1

Orders of Presentation and Qualities Presented
for Four Treatments

                 Order of Presentation
Treatment   First   Second   Third   Fourth
    1       Good    Poor     Good    Poor
    2       Poor    Poor     Poor    Good
    3       Good    Good     Good    Poor
    4       Poor    Good     Poor    Good


Judges 


The judges (Os) were randomly drawn from a
larger pool of about 700 civilian and military, both


Table 2

Treatment Means for Each Replication a, b

                               Order of Presentation
Treatment   Replication   First      Second     Third      Fourth
    1            1        7.3 (G)    3.4 (P)    7.3 (G)    3.9 (P)
                 2        6.8        3.0        7.0        2.5
                 3        7.5        3.4        7.5        3.1
                 4        7.2        3.5        6.1        4.2
    2            1        3.5 (P)    4.2 (P)    5.1 (P)    6.9 (G)
                 2        4.5        6.3        6.5        6.4
                 3        4.6        5.2        5.2        7.0
                 4        6.7        6.1        5.9        5.5
    3            1        6.8 (G)    6.9 (G)    6.7 (G)    3.3 (P)
                 2        7.4        6.8        6.9        1.9
                 3        7.9        7.4        7.5        4.1
                 4        7.1        6.9        6.8        3.2
    4            1        3.6 (P)    6.1 (G)    3.3 (P)    6.5 (G)
                 2        3.7        6.7        3.8        6.8
                 3        3.8        7.9        2.9        6.8
                 4        6.0        6.5        4.2        6.1

a A high rating signifies a high preference.
b “G” and “P” represent “Good” and “Poor,” respectively.


Contrast and Convergence in Ratings of Foods 


Table 3

Analyses of Variance of Preference Ratings

                                                       First Replication    Second Replication   Third Replication    Fourth Replication
                                                       (Cherry Beverage)    (Cherry Beverage)    (Beef Broth)         (Beef Broth)
                                                       Mean                 Mean                 Mean                 Mean
Source of Variation a                              df  Square      p        Square      p        Square      p        Square      p
 1. Poor, 2nd (Treatments 1 vs. 2) b                1    3.2000               54.4500   <.01 c     16.2000   <.05 c     33.8000   <.01 c
 2. Good, 2nd (Treatments 3 vs. 4) b                1    3.2000                 .0500               1.2500                .8000
 3. Poor, 1st & 3rd (Treatments 2 vs. 4) b          1    7.2250               30.6250   <.05 d     24.0250   <.05 d     14.4000
 4. Good, 1st & 3rd (Treatments 1 vs. 3) b          1    3.0250                 .6250                .4000                .9000
 5. 1st & 2nd vs. 3rd & 4th                         1     .9000                8.1000   <.05 d      8.1000   <.05 d     40.0000   <.01 d
 6. 1st & 3rd vs. 2nd & 4th                         1    3.6000               22.5000   <.01 d      2.5000              40.0000   <.01 d
 7. 1st & 4th vs. 2nd & 3rd                         1     .9000               32.4000   <.01 d      3.0250               0
 8. Good vs. poor                                   1  366.0250   <.01 c     324.9000   <.01 c    462.4000   <.01 c     96.1000   <.01 c
 9. Good, 4th (Treatments 2 vs. 4)                  1     .8000                 .8000                .2000               1.8000
10. Poor, 4th (Treatments 1 vs. 3)                  1    1.8000                1.8000               5.0000               5.0000
11. Good vs. poor, 1st & 2nd vs. 3rd & 4th          1     .2250                2.5000                .0250               1.6000
12. Good vs. poor, 1st & 3rd vs. 2nd & 4th          1     .6250                8.1000   <.05 d       .2250               8.1000   <.05 d
13. Good vs. poor, 1st & 4th vs. 2nd & 3rd          1    3.0250               28.9000   <.01 d      0                     .4000
14. Poor, 1st & 3rd; 1st vs. 3rd                    1    9.0250   <.01 c       9.0250   <.01 c      5.6250   <.05 c      2.5000
15. Good, 1st & 3rd; 1st vs. 3rd                    1     .0250                1.2250                .4000               1.6000
16. Judges (within groups)                         36    6.1361                4.7903               4.9292               5.1472
17. Judge-treatment (within groups)               108    1.4546                1.5106               1.6495               1.8306
18. Total                                         159

a Ordinal values refer to position of sample.
b Evaluated against judges (within groups). In all other comparisons, judge-treatment (within groups) interaction is the error term.
c One-tailed test.
d Two-tailed test.




male and female, employees who regularly partici- 
pate in taste tests. Departures from randomness oc- 
curred when some were absent or were otherwise 
not available on the days the tests were conducted. 
Separate selections of 40 Os were made for each of 
the four replications. 


Procedure 

Each group of 40 Os was randomly assigned to 
one of four treatments that differed in the order in 
which the different qualities were presented and the 
number of each quality rated. The nature of the 
treatments is summarized in Table 1. 

Each O sat in a semienclosed testing booth. Two- 
ounce samples in coded cups or glasses were pre- 
sented one at a time through a turntable in a wall 
separating the booth from the kitchen. O drank as 
much or as little as he wanted of each sample, rated 
the product on a nine-point scale described elsewhere 
(Peryam & Pilgrim, 1957), and rinsed his mouth 
ad libitum with charcoal-filtered distilled water. The 
time between the rating of one sample and the pres-
entation of the next was 45 seconds.

On the first replication, the Os, after rating the
third and fourth samples, were asked to list the posi-
tive and negative characteristics of each sample.
Their lack of ability to so verbalize led to the de-
cision to discontinue these questions on the later
replications.


Results 


The preference means of each treatment on 
each replication are given in Table 2. An 
analysis of variance of the preference ratings 
was performed for each replication separately, 
and the results of these analyses are presented 
in Table 3. In each case the ratings of the
“poors” were clearly lower than the ratings of
the “goods” (Source of variation No. 8).
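
The single-degree-of-freedom mean squares in Table 3 can be recovered from the treatment means in Table 2. The following is a minimal sketch, not the author's computing routine: the helper name `contrast_mean_square` is hypothetical, and it assumes that the 40 Os of a replication were divided evenly, 10 to each treatment. It works through Source of variation No. 1 (second-position “poor” means of 3.4 and 4.2) and No. 3 (first- and third-position “poor” means of Treatments 2 and 4) for the first replication.

```python
def contrast_mean_square(sum_a, sum_b, n_per_group, k_positions=1):
    """Mean square (1 df) for a comparison between two treatment groups.

    sum_a, sum_b: for each group, the sum over the k positions involved
        of the mean rating at that position (for k = 1, just the mean).
    n_per_group: judges per treatment, assumed equal across treatments.
    """
    difference = sum_a - sum_b
    # Sum of squares for a two-group contrast in which each judge
    # contributes k_positions ratings.
    return n_per_group * difference ** 2 / (2 * k_positions)


N_PER_TREATMENT = 10  # assumption: 40 Os split evenly over 4 treatments

# Source of variation No. 1, first replication: "poor" in second position
# after a "good" (Treatment 1) vs. after another "poor" (Treatment 2).
ms_poor_second = contrast_mean_square(3.4, 4.2, N_PER_TREATMENT)

# Source of variation No. 3, first replication: "poor" in first and third
# positions, Treatment 2 vs. Treatment 4.
ms_poor_first_third = contrast_mean_square(
    3.5 + 5.1, 3.6 + 3.3, N_PER_TREATMENT, k_positions=2
)

print(round(ms_poor_second, 4), round(ms_poor_first_third, 4))
```

Under these assumptions the two results agree with the entries 3.2000 and 7.2250 in the first-replication column of Table 3.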


Contrast Effects 


Inspection of Table 2 reveals that in every 
replication, the average rating of “poor” in 
the second position was lower when it was
preceded by a “good” than when it was pre-
ceded by another “poor” (Source of variation
No. 1). Three of the four differences were
significant at either the .05 or .01 level. Since
the combined probability (Wilkinson, 1951)
is less than .001, it is concluded that contrast
effects have been demonstrated.
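
The combined probability can be checked with a short calculation. This sketch assumes that the Wilkinson procedure amounts to the binomial probability of obtaining at least k significant results among n independent tests by chance alone; the function name is hypothetical, and the .05 level is used for all three significant results, which is conservative because some reached .01.

```python
from math import comb


def prob_at_least_k_significant(n_tests, k_significant, alpha):
    """Chance probability of k or more results significant at `alpha`
    among n independent tests (upper tail of the binomial)."""
    return sum(
        comb(n_tests, j) * alpha ** j * (1 - alpha) ** (n_tests - j)
        for j in range(k_significant, n_tests + 1)
    )


# Three of four replications significant at the .05 level or better.
p_combined = prob_at_least_k_significant(4, 3, 0.05)
print(p_combined < 0.001)
```

On this reading the chance probability is roughly .0005, which is consistent with the statement that it is less than .001.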


Convergence Effects 


Tables 2 and 3 fail to show any consistent
or significant (Source of variation No. 2)
difference in the ratings of the “goods” re-
gardless of whether a “poor” or “good” pre-
ceded. The prediction regarding convergence
effects was not confirmed.


Effect of Successive Presentations 


It was predicted that as the “poors” are
successively presented, the ratings would in-
crease, but would not increase if a “good”
intervenes. This means that the algebraic
difference between the third and first samples
should be greater for those in Treatment 2
than for those in Treatment 4. The compari-
son of Source of variation No. 14 with the
judge-treatment interaction constitutes the
appropriate test of this prediction. Signifi-
cant differences at the .05 or .01 levels were
attained for three replications; for the re-
maining replication, the results were in the
expected direction though not significant.
The combined probability is less than .001.
The hypothesis may be considered confirmed.

It was also predicted that as the “goods”
are successively served, the ratings would in-
crease, but would not increase if a “poor” in-
tervenes. Similar to the preceding predic-
tion, the algebraic difference between the
third and first samples should be greater for
Treatment 3 than for Treatment 1. How-
ever, when Source of Variation No. 15 was
tested against the judge-treatment interaction,
no significant effects emerged. Inspection of
Table 2 shows that the differences were not
always in the expected direction. Hence,
there is no support for the hypothesis.
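
The “algebraic difference between the third and first samples” is a difference of differences, and its mean square can be sketched from the Table 2 means. As before, this is an illustration under the assumption of 10 Os per treatment, and the function name is hypothetical. The divisor of 4 arises because the contrast applies weights of +1 and -1 to four cell means, each based on the same number of judges.

```python
def gain_difference_mean_square(first_a, third_a, first_b, third_b, n_per_group):
    """Mean square (1 df) comparing the third-minus-first gain of
    treatment A with that of treatment B."""
    gain_a = third_a - first_a
    gain_b = third_b - first_b
    # Four cell means enter the contrast with weights +1 or -1,
    # so the sum of squared weights is 4.
    return n_per_group * (gain_a - gain_b) ** 2 / 4


N_PER_TREATMENT = 10  # assumption: 40 Os split evenly over 4 treatments

# Source of variation No. 14 ("poors"): Treatment 2 vs. Treatment 4.
ms_rep1 = gain_difference_mean_square(3.5, 5.1, 3.6, 3.3, N_PER_TREATMENT)
ms_rep2 = gain_difference_mean_square(4.5, 6.5, 3.7, 3.8, N_PER_TREATMENT)

print(round(ms_rep1, 4), round(ms_rep2, 4))
```

Both replications give a gain difference of 1.9 scale points, which under these assumptions corresponds to the mean square of 9.0250 listed for Source of variation No. 14 in the first two replications.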


Other Tests of Significance 


The significance of Source of Variation No.
5 shows that the later samples are preferred
more than the earlier ones. However, this
and other significant sources of variation are
of only incidental interest here and will not
be further discussed.


Discussion 


Both predictions concerning changes in
preference for the “poors” were substantiated
and both predictions involving changes in
preference for the “goods” were not. In fact,
the ratings of the “goods” remained almost
invariant regardless of the nature or number
of samples preceding them. It is possible that
the “goods” were so good that the negative
characteristics, present to a marked degree
in the “poors,” were completely absent or below
threshold. Indeed, on the first replication,
Os were unable to specify anything negative
about the “good.” Absence of negative char-
acteristics makes Assumptions 3(b), 5(a),
6(b), and part of Assumption 2 inapplicable.

More definitive results could be ob-
tained with the accompaniment of an
independent assessment of the presence and
absence of both positive and negative charac-
teristics. However, present methods for de-
termining these characteristics are not satis-
factory, primarily because judges are unable
reliably and independently to describe their
introspective experiences. Currently, research
is being considered on psychometric multivari-
ate methods for inferring these characteristics
without resort to verbalizations by the judges.3

3 In psychometric research [...] of [...] preference tests,
certain departures from the postulates of order
(transitivity and asymmetry) were evident among
different samples of the same product.

Since the positive and negative charac-
teristics were not independently established,
this study cannot be considered to be a
crucial test of the validity of the assumptions.
Aside from the consideration that develop-
ment of methods for assessing these charac-
teristics will enable a rigorous test of the as-
sumptions, tentative retention of the assump-
tions set forth here should prove useful.

First, it is advocated that the following ad-
ditional assumption be added: Presentation of
a “poor” increases an individual’s awareness
of the presence of positive characteristics in
a “good,” i.e., the individual doesn’t appreci-
ate the excellence of a “good” until the ab-
sence of the positive characteristics in the
“poor” makes him cognizant of their presence
in a “good.”

If there is no independent determina-
tion of the individual’s perception of the pres-
ence of positive and negative characteristics,
inclusion of this assumption would preclude
tests of certain predictions, such as the
ones regarding convergence effects in the pres-
ent experiment. On the other hand, even
when there is no independent determination,
predictions of more complex phenomena can
be made.

Suppose, for example, four types of soda
pop: carbonated, slight off-flavor;


carbonated, marked off-flavor; noncarbon- 
ated, slight off-flavor; noncarbonated, marked 
off-flavor. If it is independently demon- 
strated that most people prefer the carbon- 
ated beverage over the noncarbonated one 
and that the slight off-flavor beverage is pre- 
ferred to the marked off-flavor one, it can be 
shown that at least seven predictions are de- 
ducible. For example, it would be predicted 
that a carbonated, slight off-flavor sample 
will tend further to depress the ratings of a 
carbonated, marked off-flavor sample that fol- 
lows it; and a noncarbonated, marked off- 
flavor sample should have the opposite effect. 
Testing of these predictions is contemplated. 

Another reason for at least tentatively re-
taining the previous assumptions is that they
focus attention on and may provide answers to
problems facing manufacturers of consumer
goods. Consider the case, for example, of a
manufacturer of a hi-fi component which is 
well liked by its users. Suppose, also, that he 
is considering adding to his line an improved 
component, which, because of its extra cost, is 
not expected to be bought by as many people. 
The question arises: Will the new component 
cause a decreased preference level for the 
older one with a consequent reduction in sales 
for it, or will the effect be mainly one of in- 
creased liking for the new and no change in 
liking for the old? 

Similarly, when those who have had pleas- 
ant experiences with such optional automobile 
equipment as automatic transmissions and 
power brakes are in the market for a new car,
their preference level for autos without these 
accessories may decline, while their preference 
levels for autos with these remain constant. 
If level of preference is related to willingness 
to buy and if such a decline in preference oc- 
curs, then those who are able to afford just 
the basic auto might rather forego its pur- 
chase or buy a used one with these extras. 

Thus, in many cases where contrast effects 
appear, it is important to determine whether 
preference for the “good” rises or whether 
preference for the “poor” declines. To the 
extent the model is able to predict which— 
the “good” or the “poor”—is responsible for 
the increased or decreased differences in pref-
erence, its bearing on marketing problems in-
creases.


Summary 


A set of assumptions was made that led to 
the hypothesis that preference ratings for poor 
quality food will be lower when preceded by 
a good quality food than when preceded by 
another poor quality item (contrast effects). 
It was also hypothesized that preference for 
a good quality food will be higher when pre- 
ceded by another good quality item than 
when preceded by a poor quality product 
(convergence effects). The other predictions 
were that preference will increase with suc- 
cessive presentations of the same quality item, 
provided no opposite quality intervenes. The
predictions concerning preference for the poor
quality foods were clearly confirmed, but 
those involving the good quality foods were 
not substantiated. Experimental and prac- 
tical implications of the assumptions and re- 
sults are discussed. 


Received May 1, 1958. 


References 


Peryam, D. R., & Pilgrim, F. J. Hedonic scale
method of measuring food preferences. Food
Technol., 1957, 11(9), Supplement, 9-14.

Wilkinson, B. A statistical consideration in psycho-
logical research. Psychol. Bull., 1951, 48, 156-158.


Journal of Applied Psychology
Vol. 43, No. 1, 1959


DISCUSSION AS A FUNCTION OF ATTITUDES
AND CONTENT OF A PERSUASIVE
COMMUNICATION 1


ELLIOTT McGINNIES AND IRWIN ALTMAN 2


University of Maryland 


Several recent investigations have demon-
strated the importance of considering the
initial relationship between the opinions of the
recipients of a communication and the posi-
tion advocated by the communication. Hov-
land, Harvey, and Sherif (1957) report that
those agreeing with the stand taken by a com-
munication judged it to be more fair and
factual as well as closer to their own posi-
tion than did those disagreeing with the com-
munication content. Persons closer in their
attitudes to the communication content also
tended to change more in that direction
than did those further removed from the
communicator’s position. Utilizing discus-
sion group situations contrasted with nondis-
cussion situations, Mitnick and McGinnies
(1958) found that shifts in attitude as well
as amount of information learned from a film
were greater among the members of discus-
sion groups than among passive audiences.
For Ss who disagreed with the communica-
tion content, however, this effect was signifi-
cantly less in the discussion groups than in
the nondiscussion groups, indicating that atti-
tude change was being influenced differen-
tially in discussion according to initial atti-
tudes of the discussants. The amount of
factual material retained after film viewing
was also positively related to agreement with the
content of the film in all groups.

In examining the rather extensive discus-
sion protocols collected by Mitnick and Mc-
Ginnies, we were impressed with certain ap-
parent discrepancies between the various atti-
tude groups with respect to their discussion
behavior. Further analysis of these data with
attention to several quantitative features of
the discussions promised to shed light upon
the interaction between attitudes and com-
munication content as a factor determining
the discussion process itself. This line of ap-
proach focused upon the verbal behavior of
the communication recipients rather than
upon the attitudinal consequences of the ex-
perimental procedure.

1 This research was supported in part by a special
grant from the National Institute of Mental Health,
National Institutes of Health, U. S. Public Health
Service.

2 Leonard Mitnick very kindly made avail-
able the data upon which the present analysis is
based. [...] Research, Inc., Ar[...]

A number of quantitative procedures de- 
signed to explore discussion behavior have 
been reported. One of the earliest of these 
was proposed by Chapple (1940) who was 
concerned with the frequency and duration 
of verbal behavior. His method, described 
by Heyns and Lippitt (1954) as “an essen- 
tially contentless observational system,” re- 
lied upon measurements of certain temporal 
aspects of interpersonal communication which 
Chapple felt could be used as predictors of 
other features of the interaction process with- 
out depending upon inferences about the mo- 
tives of the participants. This type of pro- 
cedure contrasts with that developed by 
Bales (1950), as well as with those described 
by Bradford and French (1948). Several in- 
vestigators, however, have followed Chapple 
in emphasizing a formal or a quantitative 
mode of analysis of verbal behavior in discus- 
sion groups. Stephan and Mishler (1952) 
found that a simple exponential function ade- 
quately describes the distribution of partici- 
pation among the members of small groups. 
A study by Findley (1948) focused upon dis- 
cussion participation, and the author presents 
a statistical index which is maximized when 
participation is equally distributed and mini- 
mized when two individuals monopolize the 
discussion. Dickens (1953) has also proposed 
a measure of the spread of participation in 
group discussion. 

In the present paper, several statistical in-
dices of group discussion behavior are de-
scribed which should prove applicable in
many types of discussion situations and 
which are descriptive of group reactions to 
a controversial communication. Although 
measures of this general type have been used 
previously, this essentially statistical ap- 
proach to discussion behavior has not been 
related to the known attitudinal character- 
istics of target audiences. As will be indi- 
cated, attitudinally homogeneous groups differ 
in several specific respects when they discuss 
a communication that is related to their ini- 
tial views. 


The Experiment 


Measures employed. Five statistical in- 
dices of group discussion behavior were se- 
lected on the basis of prior research as being 
descriptive of important aspects of the dis- 
cussion process. The measures were: (a) 
verbal output, (b) participations, (c) rate of 
response, (d) spontaneity, and (e) recruit- 
ment. The operations determining each of 


these measures will be described in reporting
the results.

[...] in modifying [...]. Briefly, their study investi-
gated the effects of a film, The High Wall, upon the
responses of high school students to an adaptation
of the California ethnocentrism (E) scale. Some of
the groups discussed the film, while others merely
viewed it. We shall be concerned here only with
the discussion groups. On the basis of pretesting
with the E scale, three types of discussion groups
were formed. [...] scores, [...] Ss with [...]
groups, it was assumed [...] antagonistic to the [...]
Ss were favorable to the [...] were balanced [...]
members, and no significant [...] relative socioeconomic
s[...]

Two groups were formed for each type of predis-
position, so that a total of six discussion groups,
containing 54 members, furnished the data upon
which the present findings are based. The discus-
sions were limited to a half-hour, and all were
recorded for later analysis. The findings relating
to attitude change have been reported elsewhere
(Mitnick & McGinnies, 1958). We will limit our-
selves in this paper to descriptions of the discussions
in the three predisposition groups in terms of the
measures developed.

3 We are grateful to Willard Vaughan, who [...]
as discussion leader for the groups.


Results


Verbal output. This measure is obtained
by simply counting the number of words
emitted by all members of the discussion
groups. The data from the two groups rep-
resenting each degree of ethnocentrism are
combined in order to give a more reliable in-
dication of behavior within the attitudinal
categories. Inspection of the mean word
counts reveals significant differences in verbal
output among the six predisposition groups.
The two most active groups in terms of
verbal productivity were composed of those
Ss who were assumed on the basis of their
low E scores to favor the communication.
Second to these individuals in verbal output
were the Ss with high E scores, presumed to
be antagonistic to the communication. Least
productive in terms of verbal response were
the Ss selected from the middle of the E
distribution. Frequency distributions of ver-
bal output in each of the three attitude groups
are presented in Fig. 1. A distinctly bimodal
distribution is present in the case of the low
Es, with the peaks occurring in the class in-
tervals of 1-200 and 800-1000 words. Only
two of the 18 individuals in these two discus-
sion groups had nothing to say. Six of the
18 high Es declined to participate in the dis-
cussion of the film, and relatively few are
found in the higher output categories. Fi-
nally, the middle Es were observed to be ex-
tremely reticent in the discussion situation,
some of them declining to speak at all.
The evident J shape in the case of the
middle E groups may be interpreted as indi-
cating a constriction in the range of verbal
output in moving from greater to lesser atti-
tudinal involvement with the communication.
The mean numbers of words emitted by indi-
viduals in the three predisposition conditions
were: low E, 450; high E, 296; and mid-
dle E, 175. Since the respective distribu-
tions are badly skewed, the significance of the
differences among these means was evaluated
by a nonparametric analysis of variance sug-
gested by Walker and Lev (1953). The
probability that the means were drawn from
the same population is less than .05. We may
conclude with reasonable certainty, there-
fore, that the three degrees of ethnocentrism
represented by the experimental groups pro-
duced significant differences in verbal out-
put among these groups during a discussion.
The differences are reflected not only in the
means but also in the relative skewness of the
several distributions.

Fig. 1. Distributions of verbal output (total words
emitted) under the three attitude conditions.
Number of participations. A more readily
obtained measure of discussion activity, since
it does not depend upon transcriptions of the
discussions, is the number of dis-
crete entries of an individual into the discus-
sion. This measure is taken without regard
to the length of a given participation. As an-
ticipated, the distribution of participations in
discussion closely resembled that for verbal
output. Figure 2 summarizes these data. In
the groups favorably disposed toward the
communication, the number of entries into
discussion shows fairly wide dispersion. The
mean number of participations by members
of the low E groups was 21.7, with some in-
dividuals speaking as many as 70 times.
Among the high Es, the mean number of en-
tries into discussion was 17.0. Those in the
middle groups who were disposed to speak
participated on the average of 7.6 times.
Nonparametric analysis of variance showed
the means of the three groups to differ at the
.05 level of significance.

Fig. 2. Distributions of participations under the
three attitude conditions.

As in the case of verbal output, constric- 
tion of the range of participations is seen in 
moving from strong to weak attitudes. The 
rank-order correlation for the sample as a 
whole between verbal output and number of 
participations was .95. In view of the high 
correlation between these two measures, it 
would seem advisable in many situations to 
use participations alone as a measure of dis- 
cussion activity, since this index can be re- 
corded during the discussion without depend- 
ence upon an exact protocol. 
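
A rank-order correlation of this kind can be sketched as follows. The word counts and participation counts below are hypothetical values invented for illustration and are not the data of the study; the function names are likewise assumptions. Tied scores receive the average of the ranks they span, and the coefficient is computed as the product-moment correlation of the two sets of ranks.

```python
def average_ranks(values):
    """Rank values from 1 upward, giving tied values their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # ranks are 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def rank_order_correlation(x, y):
    """Product-moment correlation computed on the ranks of x and y."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


# Hypothetical discussants: total words emitted and number of entries.
words = [0, 120, 450, 900, 30, 300]
entries = [0, 9, 41, 25, 3, 20]

rho = rank_order_correlation(words, entries)
print(round(rho, 3))
```

When two measures agree this closely in rank, the cheaper one (here, entries counted during the meeting) carries nearly the same ordering information as the one that requires a full transcript.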

Rate of response. The number of responses emitted by an S
during an interval of time is a widely used measure of
performance in a variety of situations. Insofar as attitude
predisposes the individual to a characteristic pattern of
relevant behavior, the response rate of groups composed of
individuals with known attitudinal biases should be reflected
in this measure. Rate of response in the six discussion groups
was determined by summing the words emitted by both the leader
and the group members during each of six time periods and
dividing by the length of the periods in minutes. Since the
discussions varied in length from 21 to 31 minutes, depending
upon the willingness of the participants to continue, the
measures describing temporal features of the discussions were
plotted as a function of equal portions of the meetings rather
than as functions of elapsed time. This procedure, which is
analogous to plotting Vincent learning curves, made it possible
to plot the behavior of all of the groups upon the same time
base and to make direct comparisons among them. The differences
in duration of the meetings are reflected in the measures of
verbal output which, however, are not simply artifacts of these
differences since all of the groups were permitted to use a
half-hour period if they were so inclined.

Inspection of Fig. 3 shows that response rates under the three
conditions differ not only in level but in trend. The over-all
levels of verbal activity throughout the course of the
discussions are predictable from the


[Figure 3: rate of response plotted by time periods for the Low E, High E, and Middle E groups.]


Fig. 3. Rate of response during discussion under
the three attitude conditions. The equations for the
regression lines are as follows: Low Es, Y = 5.08X +
152.17; High Es, Y = 2.81X + 136.21; Middle Es,
Y = -3.46X + 119.93.


verbal output measures. However, the trends 
reveal interesting differences among the three 
degrees of ethnocentrism represented in the 
discussion groups. Regression lines were 
fitted to the data, and the respective b co- 
efficients were tested for significance. Both 
the low E and the middle E groups showed 
a significant regression of response rate upon 
time. The slope of the function for the high 
E groups did not differ significantly from 
zero, due to greater variability in response 
rate at different periods in the discussion. 
The low E groups increased their rate of re- 
sponse throughout the discussion, while the 
middle E groups decreased in rate of verbal 
activity during the meetings. 
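
The rate-of-response procedure described above can be sketched in a few lines. This is only an illustration under stated assumptions: the function names, the `(minute_mark, word_count)` input format, and the sample numbers are hypothetical and do not come from the study. Each meeting is cut into six equal portions (the Vincent-style time base), words per minute are computed for each portion, and an ordinary least-squares line of the form Y = bX + a is fitted with X taken as the period number.

```python
from typing import List, Tuple


def rates_by_portion(word_times: List[Tuple[float, int]],
                     total_minutes: float,
                     n_periods: int = 6) -> List[float]:
    """Words per minute in each of n equal portions of one meeting.

    word_times holds hypothetical (minute_mark, word_count) pairs,
    one pair per remark by the leader or a group member.
    """
    period_len = total_minutes / n_periods
    totals = [0] * n_periods
    for minute, words in word_times:
        # A remark made at the very end is counted in the last portion.
        idx = min(int(minute // period_len), n_periods - 1)
        totals[idx] += words
    return [t / period_len for t in totals]


def fit_line(ys: List[float]) -> Tuple[float, float]:
    """Least-squares slope b and intercept a for Y = bX + a, X = 1..n."""
    n = len(ys)
    xs = list(range(1, n + 1))
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b = sxy / sxx
    return b, mean_y - b * mean_x


if __name__ == "__main__":
    # Hypothetical 12-minute meeting with three remarks.
    rates = rates_by_portion([(0.5, 60), (5.0, 30), (11.0, 90)], 12.0)
    print(rates, fit_line(rates))
```

Because every meeting is reduced to the same six portions, meetings of 21 and 31 minutes yield rate series of equal length, which is what permits the direct comparison of slopes.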

The results indicate that rate of verbal response
in a discussion situation may be predicted from
knowledge of the attitudinal dispositions of the
discussants. Those groups composed of individuals
with a favorable attitude toward the communication
under discussion display not only a high initial
rate of response but also a gradual increase in verbal
activity as the discussion progresses. Indi- 
viduals who are antagonistic toward the com- 
munication show a lower over-all response 
rate, greater variability in their rate of verbal 
activity, and no significant change in rate 
throughout the discussion. Discussants who 
are neutral or undecided with respect to the 
point of view presented for discussion display 
a generally low rate of response which de- 
clines toward the end of the discussion period. 

Elliott McGinnies and Irwin Altman


Spontaneity. In discussion situations characterized by
permissive or nondirective leadership, opportunity is
afforded for the group members to display varying degrees
of initiative in maintaining the flow of comments. This
feature of discussion behavior may appropriately be
referred to as spontaneity, or the extent to which the
discussion is carried by the group members without
prompting from the discussion leader. In the present
study the discussion leader avoided entering the
discussion except on those occasions when activity lagged
seriously or when he was posed a direct question. A remark
by a group member which occurred without immediate prior
comment from the leader was coded as spontaneous. A series
of participations initiated by the group members without
intervening leader participation, therefore, were scored
as spontaneous, whereas comments elicited by the leader
were scored as nonspontaneous.

In order to adjust for differences in total verbal
output of the groups a spontaneity index for each time
period was computed as follows. The number of comments
judged as spontaneous during each time interval was
divided by the total number of participations for that
period. This ratio was then multiplied by the percentage
of that particular group’s contribution to the verbal
output of the six experimental groups combined. This
procedure effectively weighted the spontaneity ratios
according to the over-all verbal productivity of the
attitudinal groups concerned. Points representing the
level of discussion spontaneity in the three
predisposition condi-


[Figure 4: spontaneity plotted by time periods for the Low E, High E, and Middle E groups.]


Fig. 4. Trends in discussion spontaneity under the
three attitude conditions. The equations for the
regression lines are as follows: Low Es, Y = 1.83X +
29.54; High Es, Y = 2.36X + 15.24; Middle Es,
Y = .69X + 3.78.


Persuasive Communications


tions are plotted by time periods in Fig. 4. Regression
lines have been drawn through the three sets of data to
show more clearly the group trends. Several features of
the discussions are revealed. First, it is clear that
well-defined biases toward the communication, whether
positive or negative, are associated with higher levels
of discussion spontaneity than are indeterminant
attitudes. The low Es begin the discussion with
relatively little prompting from the leader and increase
gradually in spontaneity until the end of the discussion
period. This is the only significant trend among the
three experimental conditions. Although commencing at a
low level of spontaneity, the high Es appear to become
progressively more spontaneous as the meeting continues,
although the regression effect fails to reach an
acceptable level of confidence (P < .10 > .05). It should
be interesting to determine whether this apparent
recovery from an initially low degree of spontaneity in
the high E groups is in part due to the type of
nondirective leadership provided. Having discovered that
their antagonisms toward the communication will be
accepted, perhaps the high Es generate a greater degree
of spontaneity than they would exhibit were the leader to
reinforce the content of the communication.

Members of the two neutral groups were rarely able to
produce any spontaneous remarks early in the discussion,
and they showed but little improvement in this respect as
the sessions continued. Their central position in the
distribution of ethnocentrism scores does not, of course,
indicate the nature of their ethnocentric bias as clearly
as the scores for the two extreme groups. Dependence upon
the discussion leader in this situation may reflect
uncertainty about what to say in response to a
communication that takes a positive position on matters
about which they are undecided or toward which they are
apathetic.
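
The weighted spontaneity index defined earlier in this section reduces to one line of arithmetic, shown here as a minimal sketch. The function and parameter names are hypothetical, the example figures are invented, and returning zero for a period with no participations is an assumption of this sketch rather than a rule stated in the text.

```python
def spontaneity_index(spontaneous: int,
                      participations: int,
                      group_words: int,
                      all_groups_words: int) -> float:
    """Spontaneity ratio for one time period, weighted by the group's
    percentage share of the combined verbal output of all groups."""
    if participations == 0:
        # Assumption: a silent period gets an index of zero.
        return 0.0
    ratio = spontaneous / participations
    share_pct = 100.0 * group_words / all_groups_words
    return ratio * share_pct


if __name__ == "__main__":
    # Hypothetical period: 6 of 8 participations were spontaneous, and
    # the group produced 3,000 of the 12,000 words emitted by all groups.
    print(spontaneity_index(6, 8, 3000, 12000))
```

The weighting keeps a nearly silent group from earning a high index merely because its few remarks happened to be unprompted.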
Recruitment. Since the discussion group members were
identifiable in terms of their [illegible] and seat
designations it was possible to determine from
examination of the transcripts the exact time at which
any individual entered the discussion. A full description
of the method employed is reported by one of us elsewhere
(McGinnies, 1956). We have defined the cumulative rate at
which new individuals take part in the discussion as
recruitment. In Fig. 5 we have plotted the recruitment
functions of the discussion groups under the several
conditions of ethnocentrism. Consistent with the other
temporal measures, the low E groups are seen to have the
most rapid rate of recruitment as well as the highest
final level of group participation. The high Es are
recruited less rapidly into the discussion and reach a
peak level somewhat later in time. Starting at an even
lower point, the middle Es are recruited at about the
same rate as the high Es, but they level off relatively
early. The terminal levels of group participation
achieved in the three situations are: low Es, 89%; high
Es, 67%; middle Es, 50%. Comparison of these percentages
by a t test for independent proportions shows the low Es
reaching a significantly higher final participation level
than the middle Es (P < .02). The other differences were
not significant.
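
A recruitment function of the kind defined above can be sketched as a cumulative percentage. The names, the seat labels, and the use of six equal time periods are assumptions made for illustration; the text does not state how the time axis of Fig. 5 was divided.

```python
from typing import Dict, List


def recruitment_curve(first_entry: Dict[str, float],
                      group_size: int,
                      total_minutes: float,
                      n_periods: int = 6) -> List[float]:
    """Cumulative percentage of group members who have spoken at least
    once by the end of each equal portion of the meeting.

    first_entry maps a (hypothetical) seat label to the minute at which
    that member first entered the discussion; silent members are omitted.
    """
    period_len = total_minutes / n_periods
    curve = []
    for k in range(1, n_periods + 1):
        cutoff = k * period_len
        entered = sum(1 for t in first_entry.values() if t <= cutoff)
        curve.append(100.0 * entered / group_size)
    return curve


if __name__ == "__main__":
    # Hypothetical six-member group in which three members ever speak.
    entries = {"A1": 1.0, "A2": 4.0, "B1": 16.0}
    print(recruitment_curve(entries, group_size=6, total_minutes=30.0))
```

The last value of the curve is the terminal level of group participation, the quantity compared across conditions in the text.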

These relationships suggest that the rate of recruitment
of small group members into a discussion is predictable
from knowledge of their attitudes toward a communication.
Agreement with the communication content appears to
facilitate early entrance of potential participants into
the discussion, with few additional discussants being
recruited after the first half of the discussion period.
Even a negative attitude toward the communication
produces greater over-all participation than an
indeterminate bias, although the rate of recruitment is
about the same in both cases


[Figure 5: recruitment plotted by time periods for the Low E, High E, and Middle E groups.]


Fig. 5. Rates of recruitment of discussion participants
under the three attitude conditions. The curves are
fitted by inspection.


and, again, reaches a maximum approximately
half-way through the discussion.⁴


Discussion 


It has been demonstrated that five statisti- 
cal measures differentiate consistently among 
discussion groups with different attitudinal 
sets. These measures have the advantages of 
being both objective and reliable. Do they 
contribute anything to our understanding of 
discussion behavior that could not be ob- 
tained through more qualitative techniques of 
analysis? 

Bales (1950) has made a useful and sig- 
nificant distinction between the topical and 
the process content of discussion material. 
This classification distinguishes between what 
is said in a discussion and how it is said. A 
system of “interaction process analysis” de- 
vised by Bales provides categories for the 
coding of verbal units according to their dy- 
namic significance in the discussion. Remarks 
are considered to indicate problems of orien- 
tation, evaluation, control, decision, tension- 
management, and integration. In an attempt 
to determine whether predispositions of the 
discussants would influence the discussion 
process as conventionally measured, we ap- 
plied Bales’ interaction process analysis to 
one discussion from each of the three attitude 
groups. The results were similar to those 
that we have obtained with this method when 
used with the discussions of groups viewing 
other mental health films. Namely, there 
was an accumulation of comments in Cate- 
gories 5 and 6 of Bales’ system, which are 
scored as “giving opinion, evaluation, orienta- 
tion, and information,” so that the profiles 
of the several groups were essentially similar. 
That the different groups were reacting sig- 
nificantly to the film content, however, is 
clearly revealed in the “contentless” indices 
described. Those groups with strong, al- 


4 Although based upon a relatively small N in the
present instance, the recruitment functions shown in
Fig. 5 are consistent with those obtained in prior
research with a large number of groups. Recruitment
of discussion participants in small groups dealing
with a congenial topic levels off at about 80%
shortly after the first half of a 30-minute discussion
period. In large groups, numbering up to 90,
recruitment proceeds in linear fashion throughout the
meeting but does not exceed 30% in a half-hour
period.




though opposite, attitudes differ clearly in 
their behavior from neutral groups. 

One other possible interpretation of the results must be
entertained. This lies in the fact that the California E
scale correlates highly with authoritarianism. It is
conceivable, therefore, that the differences in the
discussion measures among the experimental groups might
be attributable to personality differences rather than to
ethnocentric attitude. Bass (Bass, McGehee, Hawkins,
Young, & Gebel, 1953), for example, has reported that
girls who score low on the F scale have higher
leaderless-group-discussion scores than those who score
high. On the other hand, Rokeach (1948) reports that
verbalization scores during problem solving are much
higher for high Es than for low Es, using problems
unrelated to ethnocentrism. Rokeach (1956) also showed
that both highs and lows scored high on a scale of
dogmatism, suggesting that they might react in similar
fashion to some situations. Our results bear out this
hypothesis but, unlike Rokeach’s earlier findings, they
indicate the low Es to be more verbal. In a study of the
relationship between personality predisposition and
behavior in groups, Haythorn (Haythorn, Couch, Haefner,
Langham, & Carter, 1956) found that four-man discussion
groups composed of authoritarians differed in some
respects from groups of equalitarians. Of 16 postmeeting
observer ratings, however, only one discriminated
significantly between the high F and low F groups. There
is little evidence, then, that authoritarianism, as such,
is a significant variable determining those aspects of
group discussion behavior that we have reported.

The essential conclusion from our data would seem to be
that well-defined attitudes toward a persuasive
communication, regardless of direction, are associated
with a generally higher level of discussion activity. A
distinctly favorable attitude is reflected in greater
verbal output, a progressively increasing rate of
response, a high and accelerating degree of spontaneity,
and rapid recruitment of participants. Antagonistic
discussants rank second in these respects, while neutral
recipients of the communication reveal their apathy (or
indecision) in all of the measures. These data supplement
the usual measurements of attitude change following a
persuasive communication by reflecting social behavior
rather than replies to a questionnaire. They indicate the
feasibility of predicting group reactions to a
communication when the initial attitudes of the members
are known. Finally, they provide additional operational
meaning to the concept of “attitude” as a predisposition
to respond in consistent fashion to relevant stimulation
at the level of group analysis.


Summary 


Five relatively precise “contentless” measures applicable
to a group discussion situation were applied to the
discussion protocols of six small groups of high school
students, formed according to degree of personal
ethnocentrism. The groups, designated as low, medium, or
high in predisposition, discussed a film that attempted
to explain and liberalize prejudice toward minority
groups. Consistent differences among the three degrees of
ethnocentrism represented in the discussion groups were
reflected in the five indices. Those Ss favorably
disposed toward the communication content showed a
greater degree of discussion activity and spontaneity
than did Ss who were antagonistic or neutral toward the
communication. These statistical measures served to
differentiate the groups where other, more subjective,
analyses failed.

Received May 5, 1958.


References

Bales, R. F. Interaction process analysis: A method
for the study of small groups. Cambridge, Mass.:
Addison-Wesley, 1950.


Bass, B., McGehee, C. R., Hawkins, W. C., Young, 
P. C., & Gebel, A. C. Personality variables related 
to leaderless group discussion behavior. J. ab- 
norm. soc. Psychol., 1953, 48, 120-128. 

Bradford, L., & French, J. R. P. The dynamics of 
the discussion group. J. soc. Issues, 1948, 4, 2-8. 

Chapple, E. D. Measuring human relations: An 
introduction to the study of the interaction of 
individuals. Genet. Psychol. Monogr., 1940, 22, 
1-147. 

Dickens, M. Basic principles of measurement in 
human relations as they apply to group discussion. 
J. Communications, 1953, 3, 11-13. 

Findley, W. G. A statistical index of participation 
in group discussion. J. educ. Psychol., 1948, 39, 
47-51. 

Haythorn, W. M., Couch, A., Haefner, D., Langham, 
P., & Carter, L. The behavior of authoritarian 
and equalitarian personalities in groups. Hum. 
Relat., 1956, 9, 57-75. 

Heyns, R. W., & Lippitt, R. Systematic observa- 
tional techniques. In G. Lindzey (Ed.), Hand- 
book of social psychology. Cambridge, Mass.: 
Addison-Wesley, 1954. Pp. 370-404. 

Hovland, C. I., Harvey, O. J., & Sherif, M.
Assimilation and contrast effects in reactions to
communication and attitude change. J. abnorm. soc.
Psychol., 1957, 55, 244-252.

McGinnies, E. A method of matching anonymous 
questionnaire data with group discussion material. 
J. abnorm. soc. Psychol., 1956, 52, 139-140. 

Mitnick, L. L., & McGinnies, E. Influencing
ethnocentrism in small discussion groups through a film
communication. J. abnorm. soc. Psychol., 1958,
56, 82-90.

Rokeach, M. Political and religious dogmatism: An 
alternative to the authoritarian personality. Psy- 
chol. Monogr., 1956, 70, No. 18 (Whole No. 425). 

Rokeach, M. Generalized mental rigidity as a factor
in ethnocentrism. J. abnorm. soc. Psychol., 1948,
43, 259-278.

Stephan, F. F., & Mishler, E. G. The distribution of
participation in small groups: An exponential
approximation. Amer. sociol. Rev., 1952, 17, 598-608.

Walker, Helen M., & Lev, J. Statistical inference.
New York: Holt, 1953.


Journal of Applied Psychology 
Vol. 43, No. 1, 1959 


THE PREDICTABILITY OF RATINGS AS A
FUNCTION OF INTERRATER
AGREEMENT ¹ ²


DONALD N. BUCKNER 


Human Factors Research, Inc., Los Angeles 


When ratings are used as criterion meas- 
ures, more ultimate criteria of performance 
are generally not available for validating 
them; indeed if more ultimate measures were 
available, ratings probably would not be em- 
ployed in the first place. It is necessary, 
therefore, either simply to accept the ratings 
as valid or to seek indirect indications of 
their validity. Two such indications are often 
employed. The first is the reliability of the 
ratings as shown by the amount of agreement 
among scores assigned the same ratees by 
different raters. The second is the predict- 
ability of the ratings, or the extent to which 
they correlate with measures to which they 
should be related, according to either the re- 
sults of logical analyses or previous research. 

The purpose of this study was to investi- 
gate the hypothesis that high agreement 
among the ratings assigned the same men by 
different raters does not necessarily imply 
predictable or valid ratings and that disagree- 
ment among raters may be associated with 
predictability and possibly validity.

The hypothesis is based first on the as- 
sumption that ratee behavior in most perform- 
ance rating situations is not entirely consistent 
from one time to the next with respect to 
particular traits, primarily because no effort 
is made to control the physical and psycho- 
logical environment during the period the 
ratings are designed to cover. To be valid, 
then, ratings must reflect these inconsistencies.

Even if ratees behaved entirely consistently 


1 This study was performed as part of a criterion 
research project supported by the Personnel and 
Training Branch, Psychological Sciences Division, 
Office of Naval Research, under Contract Nonr 
1241(00). Reproduction in whole or in part is per- 
mitted for any purpose of the United States Gov- 
ernment. 

2 Robert R. Mackie, Director of Research, and 
Albert Harabedian of Human Factors Research made 
valuable contributions to the conduct of the study 
and the preparation of this report. 




with respect to particular traits regardless of the
situation, ratings of them would not necessarily be in
agreement since raters use different criteria in rating
on the same trait (Guilford, 1954, p. 295). The second
assumption, then, is that these criteria employed by
different raters are all valid and the differences in
ratings reflected by them are also valid. This second
assumption implies that part of achieving the ultimate in
performance is satisfying the demands of various
superiors by behaving in different ways.

On the basis of these assumptions, high agreement among
ratings could imply a poor sampling of observations of
ratee behavior by raters, a poor sampling of raters in
terms of the criteria they use to evaluate particular
traits, or both. Disagreement among the ratings assigned
to the same men by different raters, on the other hand,
might indicate that a more representative sample of
observations and rater criteria was obtained.

It is obvious on the basis of these two assumptions that
high interrater agreement could also indicate validity.
If a ratee knew the criteria his superiors were going to
employ in rating him, for example, he could behave,
assuming he had sufficient control of his behavior
regardless of the environmental situation, so as to
satisfy them. It is assumed, however, that the majority
of men are not entirely aware of the nature of their
superiors’ criteria, and even if they are, they neither
have adequate control over their behavior in all
situations nor are obsequious enough to attempt to
satisfy all of their superiors’ demands.

To reiterate, the hypothesis tested in the present study
was that high agreement among the ratings assigned the
same men by different raters does not necessarily imply
predictable or valid ratings.


Predictability of Ratings 61 


Method 


The samples. A total of 171 men aboard 21 different
submarines of the Pacific Fleet were rated in groups of
four to nine by three of their superiors, either by two
officers and one chief petty officer (CPO) or by one
officer and two CPO’s. Those men who had been aboard
their assigned boats for a period of at least 10 months
were selected from this total sample for the
investigation reported here. There were 97 such men so an
additional three men were randomly selected from the
group that had been aboard for nine months and added to
the sample to make it an even 100 ratees.

The rating scale. A trait scale containing 10 technical
competence (TC) traits and 10 personal adjustment (PA)
traits was used. The ratings were assigned on a scale of
25 hypothetical submariners, small robotlike figures
extending from the bottom left to the top right of each
page of the rating booklet. Verbal descriptions were
added to the extreme figures as well as the middle figure
which was described as the ordinary submariner of his
rate. Each rater assigned his ratings independently on
one trait at a time and rated only men of the same job
classification and rank at one time.
The means of the 10 TC and the 10 PA trait ratings
assigned by each rater were used in this study. Since
there were three raters, there were three mean scores for
each ratee on each of the two classes of traits. The
means of these, i.e., the ratee’s total mean rating on
the PA and on the TC traits, were used in the
correlational analyses designed to test the hypothesis
and in computing the total variance term in the
interrater agreement estimates.

Estimates of interrater agreement. An agreement score was
computed for each ratee; it was the sum of the squared
deviations of the three rater means about the ratee’s
total mean. The distribution of these scores was divided
at the 25th, 50th, and 75th centiles to yield four groups
of 25 ratees each: the high agreement (HA), moderate
agreement (MA), moderate disagreement (MD), and high
disagreement (HD) samples. Interrater agreement estimates
were made for the four samples by using the sum of the
agreement scores as the error variance term and the
variance of the total mean ratings as the total variance
in the basic equation for the coefficient of reliability
(Guilford, 1950).

This procedure of computing agreement scores for each
ratee and dividing the sample so as to obtain four levels
of interrater agreement was carried out separately for
the TC and the PA ratings. The correlation between the
two classes of ratings was [illegible] so there was
overlap between the samples; nevertheless many of the
ratees in the HA-TC sample, for example, were not in the
HA-PA sample.
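
The agreement score and the four-way split can be illustrated with a short sketch. All names and the sample data are hypothetical. Two assumptions are made that the text does not settle: ties are broken by sort order, and any remainder left after dividing the sample by four is placed in the HD group. The reliability estimate itself is not reproduced, since the exact form of the "basic equation" is not given in this section.

```python
from typing import Dict, List, Tuple


def agreement_score(rater_means: Tuple[float, float, float]) -> float:
    """Sum of squared deviations of the three rater means about the
    ratee's total mean; zero means the raters agreed perfectly."""
    total_mean = sum(rater_means) / len(rater_means)
    return sum((m - total_mean) ** 2 for m in rater_means)


def split_into_quarters(scores: Dict[str, float]) -> Dict[str, List[str]]:
    """Order ratees from smallest to largest agreement score and cut the
    ordering into four groups: HA, MA, MD, HD."""
    ordered = sorted(scores, key=scores.get)
    q = len(ordered) // 4
    return {
        "HA": ordered[:q],
        "MA": ordered[q:2 * q],
        "MD": ordered[2 * q:3 * q],
        "HD": ordered[3 * q:],  # assumption: remainder goes to HD
    }


if __name__ == "__main__":
    # Hypothetical rater means for eight ratees on the 25-point scale.
    means = {
        "r1": (14.0, 14.0, 14.0), "r2": (13.0, 14.0, 15.0),
        "r3": (12.0, 14.0, 16.0), "r4": (11.0, 14.0, 17.0),
        "r5": (10.0, 14.0, 18.0), "r6": (9.0, 14.0, 19.0),
        "r7": (8.0, 14.0, 20.0), "r8": (7.0, 14.0, 21.0),
    }
    scores = {name: agreement_score(m) for name, m in means.items()}
    print(split_into_quarters(scores))
```

Note that a small score denotes high agreement, so the HA sample sits at the low end of the score distribution.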

The criteria of predictability. Three measures were used
to compare the predictability of the ratings: the Navy
General Classification Test (GCT), the Navy Mechanical
Aptitude Test (MECH), and the Submarine School Class
Standing (SSCS). Previous research had shown that these
variables were significantly related to performance
aboard submarines as measured by ratings, check lists,
and job sample performance tests (Mackie, Wilson,
Buckner). SSCS, which is based on a composite of written
achievement test scores and instructor ratings and has an
estimated reliability of .90, was found to correlate
higher with scores on the shipboard criteria than any of
a variety of predictor variables studied. It was selected
from the measures available for this study, therefore, as
the variable most likely to be related to the ultimate
criterion and as probably the best indicator of the
validity as well as the predictability of the ratings.
Comparability of the experimental samples. F and 
t tests were performed to determine whether or not 
the experimental samples could be assumed to have 
been obtained from the same population with re- 
spect to the variables relevant to the study. None 
of the differences between means or variances of 
scores on the three predictor variables was signifi- 
cantly different from zero. The differences between 
the means and variances of months on board were 
not significantly different from zero. The differences 
between the means and variances of the ratings as- 
signed the men in the experimental groups were not 
significantly different from zero; the mean of all 
ratings was 14.7 and the standard deviation was 4.3. 
The experimental samples were assumed on the basis
of the statistical tests to have been obtained from
the same population with respect to these variables.
Correlational analyses. Scores on the three predictor
variables, SSCS, GCT, and MECH, were correlated with the
ratees’ total mean ratings. Separate analyses were
performed for each of the experimental groups and for
both the TC and the PA ratings. The score on SSCS
actually used in the computations was the proportion of
men in his submarine school class each ratee exceeded.
Pearson product-moment coefficients were computed. The
correlations were computed using both raw and standard
scores. The results were essentially the same. The raw
score results are reported here.
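
The two computations named in this paragraph can be sketched as follows. The function names and the sample values are hypothetical. In particular, expressing class standing as (class size minus rank) divided by class size is an assumption of this sketch; the text says only that the score was the proportion of classmates each ratee exceeded.

```python
import math
from typing import List


def proportion_exceeded(class_rank: int, class_size: int) -> float:
    """Proportion of a submarine school class standing below a man,
    where rank 1 is the top of the class (assumed convention)."""
    return (class_size - class_rank) / class_size


def pearson_r(xs: List[float], ys: List[float]) -> float:
    """Pearson product-moment correlation between two score lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


if __name__ == "__main__":
    # Hypothetical ranks in a class of 20 and total mean ratings.
    sscs = [proportion_exceeded(rank, 20) for rank in (2, 7, 11, 18)]
    ratings = [17.5, 15.0, 14.0, 11.5]
    print(round(pearson_r(sscs, ratings), 3))
```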


Results 


The interrater agreement estimates and the 
results of the correlation analyses are shown 
in Table 1. 

None of the correlations between scores on the
predictors and either the PA or TC ratings was
significantly different from zero in the agreement
samples, HA and MA. All three of the correlations with
the TC ratings in the MD sample were significant, the
correlation between SSCS and the TC ratings being
significant at the .01 level and the other two at the .05
level. SSCS was also significantly correlated (.05 level)
with the TC ratings in the HD sample for which the
interrater agreement estimate was .00.


Table 1

Interrater Agreement Estimates and the Correlations Between Scores on the Predictor
Variables and the Total Mean Ratings

(N = 25 ratees in each of the eight samples)

               r               SSCS              GCT               MECH
Sample     TC      PA       TC      PA       TC      PA       TC      PA
HA        .94**   .88**    .05     .05      .23     .02     -.07     .16
MA        .84**   .90**    .29     .25     -.14    -.27      .02     .01
MD        (?)     (?)      .61**   .08      .43*   -.12      .42*    .21
HD        .00     .12      .43*    .65**    (?)     (?)*     .18     .06

* Significant at .05 level.
** Significant at .01 level.
(?) Entry not legible.


Only two of the correlations computed between the
predictor scores and the PA ratings were significant and
both were with the ratings for which the interrater
agreement estimate was lowest, .12 in the HD sample. The
correlation between SSCS and the high disagreement PA
ratings was significant at the .01 level and the
correlation between scores on the GCT and those ratings
was significant at the .05 level.

Additional analyses were performed in an 
effort to locate the source of the predictable 
variance in the ratings for which the inter- 
rater agreement estimates were low, i.e., the 
MD and HD groups. Only the means of 
the ratings on the 10 technical competence 
traits were used in these analyses. 

First it was hypothesized that the more 
extreme rating with respect to the mean of 
the three assigned a ratee was contributing 
more predictable variance than the other two, 
The procedure employed in testing the hy- 
pothesis was as follows: the ratings assigned 


Table 2

Correlations Between Scores on the Predictor
Variables and the One Disagree and the
Mean of the Two Agree Ratings

(N = 25 ratees, TC ratings only)

Ratings         SSCS    GCT     MECH
One disagree    .50*    .50     .50*
Two agree       .35     .30     .41*

* Significant at .05 level.
** Significant at .01 level.


the 50 ratees in the combined MD and HD samples were
plotted on a large chart. Inspection showed that all
three raters disagreed in their evaluations of some men.
In the case of others, two of the raters were in
substantial agreement and only the third disagreed. The
25 ratees (half of the combined HD and MD samples) for
whom this latter pattern was most pronounced were
selected for study. Scores on the predictor variables
were then correlated both with the deviant rating
assigned each of these 25 men and with the mean of the
other two ratings given them. The results are shown in
Table 2.
The differences between the means and variances of the
two samples of ratings were not significantly different
from zero. The analyses showed, however, that the mean of
the ratings assigned by the two raters who were in closer
agreement correlated less with the predictors than did
the ratings assigned by the rater who disagreed.
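
The separation of each ratee's three ratings into a "one disagree" score and a "two agree" score can be sketched as below. The function name and the example values are hypothetical. In the study the 25 ratees were chosen by inspecting a chart; picking the rating farthest from the mean of the three is a mechanical stand-in assumed here, with ties going to the first rater listed.

```python
from typing import Tuple


def split_disagree(ratings: Tuple[float, float, float]) -> Tuple[float, float]:
    """Return (most deviant rating, mean of the other two ratings)."""
    mean = sum(ratings) / 3
    # Index of the rating farthest from the mean of the three.
    idx = max(range(3), key=lambda i: abs(ratings[i] - mean))
    others = [r for i, r in enumerate(ratings) if i != idx]
    return ratings[idx], sum(others) / 2


if __name__ == "__main__":
    # Hypothetical ratee: two raters near 10-11, a third at 18.
    print(split_disagree((10.0, 11.0, 18.0)))
```

Each of the two resulting scores can then be correlated with SSCS, GCT, and MECH in the same way as the total mean ratings.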
The same sort of analysis was performed using the entire
combined MD and HD samples. Again only the TC ratings
were used. In this case, of course, the agreement between
the “two agree” raters was not as great, and in some
cases the rating of the one “disagree” rater was not much
farther removed from the mean of the three ratings than
the rating given by one of the “two agree” raters. As
shown in Table 3, the two sets of ratings were almost
equally predictable from SSCS. However, scores on the GCT
and MECH variables correlated significantly (.05 level)
with the more deviant rating and not with




Table 3

Correlations Between Scores on the Predictor
Variables and the One Disagree and the
Mean of the Two Agree Ratings

(N = 50 ratees, combined HD and MD samples,
TC ratings only)

Ratings         SSCS    GCT     MECH
One disagree    .43**   .30*    .29*
Two agree       .44**   .19     .25

* Significant at .05 level.
** Significant at .01 level.


the mean of the ratings assigned by the two 
raters who were in closer agreement in their 
evaluations. 


Discussion 


The results showed that ratings of shipboard performance
for which interrater agreement estimates were high were
less predictable from scores on two aptitude tests and
school achievement than were ratings for which the
interrater agreement estimates were moderate and low.
They indicate that high interrater agreement does not
necessarily imply predictability in performance ratings
and that in some instances interrater agreement and
predictability would yield incompatible indications of
the validity of ratings.

Whether or not the results can be interpreted to mean
that interrater agreement is not necessarily a good index
of the validity of ratings depends on whether or not one
is willing to assume that the predictor variables
employed in the study are positively related to the
ultimate criterion of the performance that was rated. It
is interesting to note, however, that SSCS, which had
been shown to be the single variable most highly related
to various other criteria of shipboard performance
(ratings, check lists, and practical performance and job
knowledge tests), was also the variable that showed the
most significant positive relationships with the ratings
for which the interrater agreement estimates were low.

Ghiselli and Brown (1948), in summarizing the bases of
unreliability in ratings, state, “The indication is that
raters disagree primarily because they observe the
individuals

to be rated in different situations and under 
different conditions, and because they use 
different criteria for judging the same trait 
or characteristic.” The two factors that they 
say contribute to a lack of agreement in 
ratings were assumed in the development of 
the hypothesis tested here to contribute to 
their validity, as long as they are accurately 
reflected in the ratings. They continue by 
saying, “It follows from this evidence that 
reliability of ratings can be considerably in- 
creased by having the raters observe the indi- 
viduals under similar conditions, and by pro- 
viding techniques for making the ratings that 
will increase the likelihood that the traits or 
characteristics being judged will be evaluated 
on the same bases.” 

It is probably true that having raters ob- 
serve ratees under similar conditions would 
increase interrater agreement; it might also 
serve, however, to decrease validity for the 
ultimate criterion by failing to take into ac- 
count the variations in behavior that occur 
as a result of the changing conditions in the 
real on-the-job situations and the possible 
interactions between ratee performance and 
environmental conditions. 

Different members of a work group might 
react or perform well in one situation and 
poorly in another. Certainly with the variety 
of situations individuals face from day to day 
regardless of their occupations, they could 
not be expected to react consistently in all 
of them. Submariners, for example, live in 
a potentially threatening environment faced 
with the possibility of a tremendous variety 
of situations. In the operational environ- 
ment, officers and CPOs cannot observe their 
men perform either under similar conditions 
or in all situations, not only because of the 
physical layout of the boat but also because 
they have their own jobs to perform. Obser- 
vations of ratee behavior are of necessity 
almost chance occurrences. To develop a 
method whereby the ratees could be observed 
under similar conditions, even if it were possi- 
ble, would probably imply the exclusion of 
critical situations in which a man’s behavior 
would have potentially the greatest signifi- 
cance as far as his contribution to the effec- 
tiveness of the boat is concerned. The oppor- 
tunity of observing behavior in the critical 


64 Donald N. Buckner 


situations may be limited and at least 
partly dependent on chance; nevertheless, in- 
creasing interrater agreement by having raters 
observe ratees under similar conditions might 
defeat the more important purpose of obtain- 
ing valid ratings. 

With standardized environmental condi- 
tions such as Ghiselli and Brown suggest and 
with rater training so as to reduce the number 
of independent criteria raters employ in rating 
on particular traits, higher interrater agree- 
ment would probably imply greater predict- 
ability. Differences between the ratings as- 
signed the same men by different raters would 
probably reflect error variance. It is pro- 
posed, however, that such differences resulting 
from ratings being made in the entirely un- 
structured on-the-job environment may reflect 
real differences in ratee behavior and, thus, 
true variance. 

It is not being suggested that high inter- 
rater agreement always implies a lack of pre- 
dictability. Essentially none of the variance 
in the high agreement ratings and only a 
portion of the variance in the low agreement 
ratings was predictable from scores on the 
three predictor variables used in this study. 
It is conceivable that both of these sources 
of variation could be predicted from other 
types of measures. 


Summary 


The hypothesis tested was that high agree- 
ment among the ratings assigned the same 
men by different raters does not necessarily 
imply predictable ratings. 

Two groups of ratings, personal adjust- 
ment and technical competence trait ratings, 
made by three superior officers, officers and 
chief petty officers, of 100 submariners serv- 
ing aboard 21 different submarines were each 
divided into four samples so as to achieve 
four levels of interrater agreement: .94, .84,
.69, and .00 for the technical competence
ratings and .88, .90, .61, and .12 for the per- 
sonal adjustment ratings. Correlations were 
then computed within each sample between 
three predictor variables (Submarine School 
Class Standing and the Navy General Classi- 
fication and Mechanical Aptitude Tests) and 
the mean of the three ratings assigned to 
each ratee. 

The hypothesis was supported by the re- 
sults. None of the 12 correlations between 
the predictor variables and the ratings for 
which the interrater agreement estimates were 
high (.94, .84, .88, and .90) was significantly 
different from zero. Six of the 12 correlations
computed for the low agreement ratings (.69,
.00, .61, and .12) were significantly different
from zero, two at the .01 level and four at
the .05 level. Three of the six significant
correlations were with the ratings for which
the interrater agreement estimates were not
significantly different from zero, .00 and .12.

It was concluded that high interrater agree-
ment does not necessarily imply predictable
ratings and may in some instances indicate a
lack of predictability.


Received May 6, 1958. 


References 


Ghiselli, E. E., & Brown, C. W. Personnel and in-
dustrial psychology. New York: McGraw-Hill,
1948.

Guilford, J. P. Fundamental statistics in psychology
and education. (2nd ed.) New York: McGraw-
Hill, 1950.

Guilford, J. P. Psychometric methods. (2nd ed.)
New York: McGraw-Hill, 1954.

Mackie, R. R., Wilson, C. L., & Buckner, D. N.
Research on the development of shipboard per-
formance measures. Part V: Interrelationships be-
tween aptitude test scores, performance in sub-
marine school, and subsequent performance in
submarines as determined by ratings and +° h 
Los Angeles: Management and Marketing Research
Corp., N8 onr 70001 and Nonr 1241(00), 1954.


Journal of Applied Psychology
Vol. 43, No. 1, 1959


CONDUCTANCE LEVELS DURING VIGILANCE TASK 
PERFORMANCE * 


SHERMAN ROSS, JOSEPH DARDANO,2 AND RAY C. HACKMAN 3


University of Maryland 


The work of Mackworth (1950) has stimu-
lated the study of “vigilance” behavior
from several points of view. For example,
the Maryland group has been interested in
the phenomenon from the viewpoint of stress
and fatigue (Andrews and Ross, 1955; Whit-
tenburg, Ross, and Andrews, 1956). Other
investigators have been concerned with the
characteristics of the signal such as rate of
signal presentation (Deese and Ormond,
1953), intensity and duration of the signal
(Adams, 1956), and changes in sensitivity to
the signal (Bakan, 1955). Recent approaches
to the problem of vigilance may be found in
the report of the 1956 Psychology Section
meeting of the British Association for the
Advancement of Science (Mackworth, 1956;
British Association, 1957).

The performance functions presented by
Mackworth are based on averaged data.
There is wide variation among individuals,
and some do not exhibit decrement in vigi-
lance. Individual variation is unrelated
to visual acuity. Although the need for indi-
vidual predictors has been stressed (British
Association, 1957), no characteristic has been
found which correlates highly with vigilance
performance.
Since the decline of performance has been
explained in terms of a state of drowsiness
induced by the monotony of the monitoring
situation (Bakan, 1955), an effective dimen-
sion for differentiating individual perform-
ance might be “level of activation” as de-
scribed by Schlosberg (1954). This concept
places all emotional behavior on a continuum
ranging from sleep to extreme excitement.
Conductance as an index of activation
level has been related to such variables as
hand steadiness and reaction time (Schlos-
berg, 1954). Two recent studies also find
conductance related to the individual's per-
formance. Hussman and Hackman (1955)
found significant inter- and intra-individual
differences between GSR and flying perform-
ance measures of pilots. Pilots who per-
formed better generally exhibited a higher
GSR. Parker and Hackman (1955) found
that the GSR level of Naval pilots while
viewing statements concerning flight safety
procedures was related to flying skill.

1 The authors would like to express their thanks to
T. G. Andrews for his aid with the analysis of the
results of this experiment, and to T. A. Hussman
... and construction ...

2 Now at Aberdeen Proving Grounds, Aberdeen, Maryland.

3 Now at the University of Pittsburgh.

The present study used change in basal 
conductance as the indicator of activation 
level and was undertaken as a preliminary 
study to deal with the following questions: 
(a) Does the conductance change during a 
vigilance task exhibit a systematic trend? 
(b) If changes in conductance do exhibit a 
systematic pattern, will the pattern be unique 
for each individual or descriptive of all Ss? 
(c) Is the conductance trend related to effi-
ciency of performance?


Procedure 

Fig. 1. Schematic diagram of the conductance meter. (Labeled parts: Triplet Mod. 321 meter, Mallory RM-12, D.P.D.T. switch, plug to palm contacts.)

The apparatus used to record skin conductance
is a microammeter calibrated to read conductance.
(See Fig. 1.) The meter reads directly the current
passing through the hand which, in this circuit,
equals conductance. This relationship holds because
the applied voltage is adjusted to 1 volt, thus I 
equals 1/R. The reciprocal of resistance is conduct- 
ance. The 28.6 K resistor is used as the standard 
resistance of an S in order to obtain the desired 
calibrated current to apply. The S’s resistance is 
introduced into the circuit by inserting the plug 
containing the leads from the palm electrodes into 
the jack, which substitutes S’s resistance for the 
28.6 K resistor. The instrument has two scales de- 
pending on the position of the double pole double 
throw switch (D.P.D.T.). When the switch is in
the X1 position, the meter face scale reads directly 0
to 50 microamperes. When the switch is on X2, the
1.8 K resistor is then in parallel with the meter move-
ment, and the meter reads from 0 to 100 micro-
amperes. Copper electrodes and electrode paste were
used for the palm contacts. 
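
The meter relationship described above is Ohm's law with the applied voltage fixed at 1 volt. A minimal sketch follows, with hypothetical function names and units chosen for illustration: microamperes for current, micromhos for conductance. It shows why the current reading can be taken directly as conductance and what current the 28.6 K standard resistor should produce.

```python
def current_in_microamperes(resistance_ohms, applied_volts=1.0):
    """Ohm's law, I = V / R, expressed in microamperes."""
    return applied_volts / resistance_ohms * 1e6


def conductance_in_micromhos(current_ua, applied_volts=1.0):
    """G = I / V. At 1 volt the microampere reading equals micromhos."""
    return current_ua / applied_volts


if __name__ == "__main__":
    # Calibration check with the 28.6 K standard resistor named in the text.
    standard_ua = current_in_microamperes(28_600)
    print(f"standard resistor: {standard_ua:.2f} microamperes")
    print(f"conductance: {conductance_in_micromhos(standard_ua):.2f} micromhos")
```

At any applied voltage other than 1 volt the reading would have to be divided by that voltage, which is presumably why the voltage is adjusted before S is connected.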

The Ss used were six men and three women stu- 
dents at the University of Maryland. Each S had 
20/20 visual acuity corrected or uncorrected. A 
modified form of the Mackworth “clock” apparatus 
was used (Whittenburg et al., 1956). The S was re-
quired to detect double jumps of a clock pointer
which made discrete jumps every 2 sec., and to re- 
spond to such jumps by pressing a switch. 

The same number and temporal pattern of double 
jumps employed by Mackworth was used. Double 
jumps occurred at .75, 1.5, 3.0, 5.0, 7.0, 8.0, 13.0, 
14.0, 15.0, 17.0, 20.0, and 30.0 min. during each half 
hour period. The four half-hour periods followed 
each other without interruption. During the 2 hr. 
session there were 7200 pointer movements, of which 
48 were double jumps. The performance measures 
were errors of omission (failures to detect a double 
jump) and errors of commission (responses when 
double jumps did not occur). 
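
The signal schedule described above can be laid out explicitly. The sketch below only restates the timings given in the text (12 double jumps per half-hour period, four consecutive periods). The constant and function names are hypothetical.

```python
# Minutes from the start of each half-hour period at which a double jump occurs.
OFFSETS_MIN = [0.75, 1.5, 3.0, 5.0, 7.0, 8.0, 13.0, 14.0, 15.0, 17.0, 20.0, 30.0]
PERIOD_MIN = 30
N_PERIODS = 4


def signal_times():
    """Times (min. from session start) of every double jump in the session."""
    return [period * PERIOD_MIN + offset
            for period in range(N_PERIODS)
            for offset in OFFSETS_MIN]


def gaps(times):
    """Intervals between successive double jumps."""
    return [later - earlier for earlier, later in zip(times, times[1:])]


if __name__ == "__main__":
    times = signal_times()
    print(len(times), "double jumps; last at", times[-1], "min.")
    print("shortest gap:", min(gaps(times)), "min.; longest gap:", max(gaps(times)), "min.")
```

The intervals between signals are irregular, ranging from under a minute to ten minutes, which is the feature of the schedule that makes sustained monitoring necessary.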

The 12 X 10 ft. testing area was enclosed by parti- 
tions, 8 ft. tall. S sat 7 ft. from the clock, which 
fitted into the forward partition at a height of 3.5 ft. 
Palm contacts were placed on the front and back of 
the left hand. The S rested his left forearm and 
hand on the wide left arm of the chair during the 
2 hr. session. The response switch was mounted on
the right arm of the chair, positioned so it could be
pressed by S while his arm was in resting position. 


Table 1 
Microampere Readings at 5 Min. Intervals for Nine Ss During Clock Test 


Subjects 

Time 1 2 3 4 5 6 7 8 9 
0 — — — 41 35 40 123 43 112 
5 44 24 147 36 35 44 172 42 116 
10 52 16 151 33 54 38 161 40 130 
15 56 24 151 41 102 32 137 40 126 
20 53 18 147 43 109 30 112 42 123 
25 62 18 147 42 109 26 147 41 130 
30 60 22 144 24 112 28 140 52 123 
35 62 21 137 26 105 44 147 50 126 
40 63 30 137 45 105 32 158 48 126 
45 66 27 133 31 102 32 147 52 126 
50 66 26 130 25 98 40 144 52 126 
55 69 46 130 37 91 26 144 50 130 
60 66 50 130 29 84 40 151 48 116 
65 69 38 123 27 81 40 137 54 112 
70 84 60 116 42 74 32 151 58 112 
75 84 60 123 36 74 38 151 56 109 
80 81 50 116 24 74 41 144 50 109 
85 84 54 116 24 74 42 140 52 112 
90 88 56 116 27 74 39 140 52 112 
95 91 74 105 38 74 50 137 54 105 
100 91 76 98 39 70 41 140 62 102 
105 95 70 95 35 67 50 144 64 102 
110 95 62 88 31 60 39 151 62 98 
115 98 68 84 36 56 41 147 64 98 
120 98 76 74 41 56 35 147 64 95 
Mean 72 43 125 34 80 38 145 51 116 
Range 54 60 77 21 77 24 60 24 35 


Fig. 2. Conductance change for each S including cluster membership. S is identified by number above
the curve; the ordinate is in microamperes. The solid curve and points indicate Cluster 1; dashed lines and x’s,
Cluster 2; and dotted lines and circles, Cluster 3. The numbers above each curve at 30, 60,
90, and 120 min. are total errors of omission and commission made by the S during the preceding
half-hour period.


E remained outside the testing area during the ses-
sion.

Conductance readings were taken at 5 min. inter-
vals, at the time of the double jumps and, in addi-
tion, ... and 15 sec. after the double jumps. The
readings at and immediately after the double jumps
were included to determine the effect, if any, of the
critical stimulus on conductance.

A practice ... preceded the ... Double jumps occurred
at 0.5, 1, 25 ... were pointed out by the E to em... relation
to the single jumps.


Results


The percentage of double jumps omitted
by all nine Ss was ... in the first half-hour, 5%
in the second, 14% in the third, and 20% in
the fourth half-hour. The conductance readings
at 5 min. intervals for each S are shown in
Table 1. In order to index the consistency
of trend of conductance changes during the
testing session, autocorrelations were com-
puted for the sequence of 25 observations on
each S. There were 24 readings in the se-
quence for Ss 1, 2, and 3 due to the lack of
a reading at the start of the testing session.
Successive pairs in the sequence were corre-
lated, i.e., 1 with 2, 2 with 3, etc. The cor-
relation coefficients for the 9 Ss were as fol-
lows: ...

To obtain conductance indices ... of a given time ... the
first 20 min. of the testing session were dis-
carded to eliminate changes due to initial ad-
justment to the testing situation. The re-
maining readings were grouped into triads
and the average of each triad taken. The
obtained value spans 10 min. In these suc-
cessive averages, the last reading in the triad
is included as the first reading in the next


group. The rank order correlation of the 
conductance readings of each S with every 
other S was then computed. A cluster analy- 
sis was performed on this correlation matrix, 
and three distinct clusters emerged: Cluster 
1: Ss 1, 2, 6, 8; Cluster 2: Ss 3, 5, 9; and 
Cluster 3: Ss 4, 7. Fig. 2 summarizes these 
results. 
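
The triad averaging and rank order correlation described above can be illustrated with the Table 1 readings. This is a sketch, not the authors' procedure: it assumes the triads begin with the 20 min. reading, handles ties by average ranks, and uses hypothetical names throughout. The readings are the Table 1 values from 20 through 120 min. for the seven Ss in Clusters 1 and 2.

```python
from math import sqrt
from statistics import mean

# Table 1 readings (microamperes), 20 to 120 min. in 5 min. steps.
READINGS = {
    1: [53, 62, 60, 62, 63, 66, 66, 69, 66, 69, 84, 84, 81, 84, 88, 91, 91, 95, 95, 98, 98],
    2: [18, 18, 22, 21, 30, 27, 26, 46, 50, 38, 60, 60, 50, 54, 56, 74, 76, 70, 62, 68, 76],
    3: [147, 147, 144, 137, 137, 133, 130, 130, 130, 123, 116, 123, 116, 116, 116, 105, 98, 95, 88, 84, 74],
    5: [109, 109, 112, 105, 105, 102, 98, 91, 84, 81, 74, 74, 74, 74, 74, 74, 70, 67, 60, 56, 56],
    6: [30, 26, 28, 44, 32, 32, 40, 26, 40, 40, 32, 38, 41, 42, 39, 50, 41, 50, 39, 41, 35],
    8: [42, 41, 52, 50, 48, 52, 52, 50, 48, 54, 58, 56, 50, 52, 52, 54, 62, 64, 62, 64, 64],
    9: [123, 130, 123, 126, 126, 126, 126, 130, 116, 112, 112, 109, 109, 112, 112, 105, 102, 102, 98, 98, 95],
}


def overlapping_triads(readings):
    """Average successive triads; each triad's last reading starts the next."""
    return [mean(readings[i:i + 3]) for i in range(0, len(readings) - 2, 2)]


def average_ranks(values):
    """Ranks from 1, with tied values sharing the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return ranks


def pearson(x, y):
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def rank_order_r(x, y):
    """Rank order correlation: Pearson r computed on the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


TRIADS = {s: overlapping_triads(r) for s, r in READINGS.items()}


def spread(triad_index):
    """Range of triad averages across the seven Ss at one point in time."""
    values = [t[triad_index] for t in TRIADS.values()]
    return max(values) - min(values)


if __name__ == "__main__":
    print("range at first triad:", round(spread(0), 1))
    print("range at last triad:", round(spread(-1), 1))
    print("S 1 vs S 3:", round(rank_order_r(TRIADS[1], TRIADS[3]), 2))
    print("S 3 vs S 5:", round(rank_order_r(TRIADS[3], TRIADS[5]), 2))
```

Under these assumptions each S yields ten triad averages, Ss in the same cluster correlate positively, and Ss from the ascending and descending clusters correlate negatively.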

Cluster 1 is characterized by a gradual in- 
crease in conductance during the testing ses- 
sion, and Cluster 2 by a consistent decline 
over the 100 min. interval. The direction of 
these curves is seen to be associated with 
conductance level at the 20 min. point. The 
conductance of all Ss in Cluster 1 at 20 min. 
was lower than the conductance of the Ss in 
Cluster 2. It is noted that the Ss in Cluster 
2 maintain a higher conductance level until 
the 70 min. point. After this time three of 
the ascending functions exceed one or more 
of the descending functions. This apparent 
converging of those conductance levels which 
did change is shown in the decreased range of 
conductance readings for these clusters, from 
126.7 at 20 min. to 58.7 after 120 min. Clus- 
ter 3 consists of two Ss whose conductance 
trend is represented by straight lines. A finer 
time analysis revealed these as approximately 
cyclical. 

The average numbers of errors of omission 
and commission within each cluster were as 
follows: Cluster 1, 8.0; Cluster 2, 5.7; and 
Cluster 3, 7.5. These results suggest that 
high conductance may be associated with bet- 
ter performance. To examine this relation- 
ship further, two groups were formed for each 
half-hour period according to the conduct- 
ance level of the Ss at the end of each period. 
The “upper” group included the four Ss with 
the four higher conductance readings; the 
“lower” group included four Ss with the four 
lower readings. The S with the median con- 
ductance level for each period was not in- 
cluded. For any one period, the groups could 
be composed of different Ss. Table 2 com- 
pares the performance of these groups. The 
differences between total errors were in the 
expected direction, but were not statistically 
significant. 

The total of 15 errors of commission had 
the following distribution for the half-hour 


Table 2 


Comparison of the Performance of Groups Based on the 
Four Higher and Four Lower Conductance Levels 
at the End of Each Half-Hour Period 
Errors include omission and commission 


Half-Hour Periods 


Group 1 2 3 4 Total 
Upper 2 5 6 7 20 
Lower 12 1 8 15 36 


time periods: 11, 1, 1, and 2. This agrees
with a previous observation (Adams, 1956)
that most false reports occur early in the ses-
sion, but with time the irrelevant stimuli are
better discriminated. Ss 1, 2, 4, and 6, who
were responsible for the 11 errors in the first
session, had conductance readings of 60 mi-
croamperes or less during the first half-hour.
The Ss with the four higher conductance lev-
els during that period had no errors of com-
mission. This result does not support the
view that the more excited Ss are prone to
react to irrelevant stimuli.


Discussion 


The consistency of the individual conduct-
ance trends shows a regular and continuous
readjustment process by the S to a prolonged
and monotonous task. Six of the nine auto-
correlations exceeded .80, which indicates that
conductance change is related to monitoring
time. Two of the remaining trends can be
considered consistent since they described a
cyclical pattern during the session.
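
The autocorrelation index referred to above correlates each reading with the one that follows it. The sketch below applies that idea to two columns of Table 1: S 3, whose conductance declines steadily, and S 4, whose readings fluctuate. The function names are hypothetical, and the computation is an illustration rather than a reproduction of the reported coefficients.

```python
from math import sqrt
from statistics import mean

# Table 1 readings in microamperes at 5 min. intervals.
S3 = [147, 151, 151, 147, 147, 144, 137, 137, 133, 130, 130, 130,
      123, 116, 123, 116, 116, 116, 105, 98, 95, 88, 84, 74]        # 5 to 120 min.
S4 = [41, 36, 33, 41, 43, 42, 24, 26, 45, 31, 25, 37, 29,
      27, 42, 36, 24, 24, 27, 38, 39, 35, 31, 36, 41]               # 0 to 120 min.


def pearson(x, y):
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def lag1_autocorrelation(series):
    """Correlate reading 1 with 2, 2 with 3, and so on."""
    return pearson(series[:-1], series[1:])


if __name__ == "__main__":
    print("S 3:", round(lag1_autocorrelation(S3), 2))
    print("S 4:", round(lag1_autocorrelation(S4), 2))
```

A steady trend gives a coefficient near 1 because each reading closely predicts the next, while a fluctuating record gives a much lower value even when, as the text notes, the fluctuation is itself roughly cyclical.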

These individual trends can be classified
into three types of reaction: (a) an increas-
ing basal conductance which suggests that
the S expends greater effort in order to re-
main vigilant, (b) a decreasing basal con-
ductance which can be interpreted as an in-
ability by the S to maintain a high level
of vigilance, (c) fluctuation of conductance
around an average level suggesting continuous
compensatory efforts by the S to maintain a
given level of vigilance. To account for a
pattern like that of S 7, allowance must be
made for the degree to which an individual
will tolerate the discomfort induced by main-
taining high efficiency.




These generalizations assume that high 
conductance level is related to high vigilance. 
Although the data suggest this relationship, 
no statistically significant relationship was 
found. 

Broadbent (1953) has pointed out that in- 
terruptions during the session prevent the 
early rapid decline in performance. Such an 
effect may have occurred in the present study 
due to the conductance hook-up or irregular
external auditory stimuli, but efforts were
made to prevent any such interference.

A GSR of a few milliamperes occurred for
all Ss when the double jump was presented.
The amplitude and frequency of the deflec-
tion decreased as the session continued. In
a few instances such deflections occurred
when the Ss did not detect the double jump.


Summary 


Conductance during a vigilance task and its
relationship to performance was investigated.
The apparatus and procedure were similar to that
used by Mackworth in his “clock” test. Six
men and three women students were used
as Ss.

The conductance trends over the two-hour
session formed three clusters: ascending in
four Ss, descending in three Ss, and cyclical
in two Ss. No significant differences were
found between the performances of these
three clusters nor between high and low con-
ductance groups. The results suggest, how-
ever, that higher conductance level is asso-
ciated with better performance.

Eleven of the 15 errors of commission oc-
curred during the first half-hour. None of
these were made by the Ss with the four
higher conductance levels.


Received May 7, 1958. 


References 


Adams, J. A. Vigilance in the detection of low-in- 
tensity visual stimuli. J. exp. Psychol., 1956, 52, 
204-208. 

Andrews, T. G., & Ross, S. Summary report on 
studies of behavior efficiency. College Park, Md.: 
Univer. of Maryland, 1955. (Project No. DA- 
49-007-MD-222) (O.I. 19-52). 

Bakan, P. Discrimination decrement as a function 
of time in a prolonged vigil. J. exp. Psychol., 
1955, 50, 387-390. 

British Association for the Advancement of Science, 
Psychology Section. Vigilance. The advancement 
of science, 1957, No. 53, 389-410. 

Broadbent, D. E. Noise, paced performance, and
vigilance tasks. Brit. J. Psychol., 1953, 44, 295-
303.

Deese, J., & Ormond, E. Studies of detectability
during continuous visual search. WADC Tech.
Rep., WADC-TR-53-8, 1953.

Hussman, T. A., Jr., & Hackman, R. C. The rela-
tionship between psychogalvanic activity and
pilot performance under simulated instrument fly-
ing conditions. Bethesda, Md.: Naval Med. Res.
Inst., 1955. (Res. Rept. Proj. Nm 001 056.08.02.)

Mackworth, N. H. Researches on the measurement
of human performance. London: His Majesty’s
Stationery Office, 1950. (Medical Res. Council
Spec. Rep. Ser. No. 268.)

Mackworth, N. H. Vigilance. Nature, 1956, 178,
1375-1377.

Parker, J. F., Jr., & Hackman, R. C. The predic-
tion of a criterion of flight safety in Naval avia-
tion. Bethesda, Md.: Naval Med. Res. Inst., 1955.
(Res. Rept. Proj. Nm 001 056 08.01.)

Schlosberg, H. Three dimensions of emotion. Psy-
chol. Rev., 1954, 61, 2, 81-88.

Whittenburg, J. A., Ross, S., & Andrews, T. G.
Sustained perceptual efficiency as measured by the
Mackworth “Clock” test. Percept. Mot. Skills,
1956, 6, 109-116.


Journal of Applied Psychology
Vol. 43, No. 1, 1959


VOCATIONAL INTERESTS OF NAVAL AVIATION
CADETS: FINAL RESULTS 1, 2


ROBERT B. VOAS 3


U. S. Naval School of Aviation Medicine, Pensacola, Florida


Motivation for flying is an important 
requisite for success in the flight program. 
Unfortunately, most of the young men who 
enter flight training have had little experi- 
ence with flying, and, therefore, there is little 
on which to evaluate their desire to fly. This 
study attempts to determine whether the in- 
terest patterns of students who complete 
training and students who fail differ in such 
a way as to permit the use of a vocational 
interest test as a selection device. 

The Kuder Preference Record: Vocational, 
Form BM (KPR) (Kuder, 1946) is a stand- 
ard interest inventory which has been used in 
previous attempts to predict flight training 
success. Cerf (1947) reported that the in- 
ventory failed to predict success in Army Air 
Force training during World War II. How- 
ever, Rosenberg and Izard (1954) compared 
the scores of 137 naval aviation cadets who 
took the KPR after leaving the training pro- 
gram with 137 cadets who were still success- 
fully pursuing their training after nine months 
in the program. They found that the un- 
successful cadets had lower scores on the 
Mechanical and Scientific scales and higher 
scores on the Persuasive, Literary, and Mu- 
sical scales. The present study is a follow-up 
of the work of Rosenberg and Izard designed 
to determine the usefulness of KPR as a pre-
dictor of success in the flight program when
administered before training begins. 


1 The data upon which this report is based have
been reported under the title: Inventory testing of
vocational interests of naval aviation cadets: Final
results. U. S. Naval School of Aviation Medicine
Research Report No. NM 14 02 11.01, April 1957.

2 The opinions or assertions contained herein are
the private ones of the writer and are not to be con-
strued as official or reflecting the views of the Navy
Department or the naval service at large.

3 Now at the Naval Medical Research Institute,
Bethesda, Maryland.




Procedure and Results 


In addition to the two groups described
above, Rosenberg and Izard (1954) adminis-
tered the KPR to 16 classes of entering
cadets. The test was administered as part
of the check-in procedure during the stu-
dent’s first week in the training program.
The records of 605 of these cadets were avail-
able for the present study. Of the 605 ca-
dets, 465 successfully completed training (S
group); 74 withdrew from the program at
their own request (W group); 34 failed in
some portion of flight training (F group);
and 32 were eliminated for medical or miscel-
laneous reasons (M group). Besides com-
paring these groups on the standard KPR
scales, a special scale was constructed which
would reflect the differences which Rosenberg
and Izard found between successful and with-
drawing cadets. The test papers of the 137
students tested after leaving the program and
the 137 successful cadets tested after nine
months of training were analyzed. In those
item triads which demonstrated differences in
response significant at the P = .01 level or
better between these criterion groups, a score
of “1” was assigned to the alternative or al-
ternatives most frequently marked either as
most or least interesting by the cadets in the
withdrawal group. A score of “0” was as-
signed the responses most frequently marked
by the successful cadets. In this way an 80-
item voluntary withdrawal (VW) scale was
constructed. A high score on the VW scale
indicated that the individual had an interest
pattern similar to that of cadets who volun-
tarily withdraw from flight training, while a
low score indicated interests similar to suc-
cessful cadets. The mean VW scale score for
the 137 successful cadets was 25.70 with an
SD of 12.07, while the mean score for the
137 withdrawing cadets was 39.86 with an



Table 1 


Means on the Kuder Scales for Training Criterion Groups by Time of Testing 


Pretest Results Concurrent Test Results* 


Cadets Cadets All 
Successful _ Who Who Other | Total Gaaais Corre- 
Codes. Withdrew Failed Attritions, Aurion Successful Who enh 
(S) Group (W) Group (E) Group (M) Group Groun, Cadets Withdrew “MeT 
Kuder Scales Med Noi Nes Na N iso N=137 N=137 N =605 
1. Mechanical 
anica 77.68 2 7 78.2 5 7: 
3 Computational yee ee pa S2 B2 Sth 
© Pee. 69.15 67.74 69.53 68.88 68.42 61.094 1250 
B AA 70.78 74.91 72.00 72.06 73.93 82.340 ar 
6. Littecare 51.58 53.56 50.69 51.85 53.08 50.44 02 
2, Musial 40.91 43.910 38.63 41.11e 35.50 41.12 —09 
8; Social Servi 19.57 19.76 20.25 19.77 17.19° 19.74 ‘02 
9; Claiesy e 69.18 62.85 66.47 67.01 65.71 61.31 —113° 
10. Voluntary: With- 39.78 41.82 42.81 40.97 42.72 44.884 — 07. 
drawal Scale 32.36" 31.59 32.56 32.22° 25.70% 39.864 —.30° 


* Source reference: Rosenberg & Izard, 1954.
Significantly different from S group at the P < .05 level.
Significantly different from S group at the P < .01 level.
Significantly different from W group at the P < .01 level.
Significant at the P < .01 level.


SD of 16.10. This difference produces a bi-
serial correlation of .56 between the VW scale
and the successful-withdrawal criteria for the
original standardization group. This figure
capitalizes on chance differences so that shrinkage
would be expected on cross validation.

Means on the nine standard scales and the
VW scale were computed for each of the four
(S, W, F, and M) criterion groups. These
data, together with the scores for the two
groups studied by Rosenberg and Izard, are
presented in Table 1. To determine the re-
lationship of these interests to the aptitudes
which are important to success in flight train-
ing, correlations were computed between the
Kuder scales and the tests of the Naval Avia-
tion Selection Battery. The largest relation-
ships were found with the Mechanical Com-
prehension test, a measure of the ability to
visualize mechanical relationships. The cor-
relations for this test are also included in
Table 1. Since the voluntary withdrawal
scale was significantly related to mechanical
ability, an analysis of covariance on the VW
scale scores for the total attrition versus suc-
cessful group was carried out. The results
of this analysis appear in Table 2.


Table 2

Results of an Analysis of Covariance on the VW Scores
for the Successful and Total Attrition Groups,
With the Effect of the MCT Held Constant

Source of Variation    Sum of Squares of Errors of Estimate    df     Mean Square
Total                  101,749.42                              603    —
Adjusted
  Between Groups       579.27                                  1      579.27
  Within Groups        101,170.15                              602    168.06

F = 3.45, P > .05


Discussion


The first question to be considered was the
extent to which the differences between the
S and W groups on the standard KPR scales
corroborated the findings of Rosenberg and
Izard. They found that the voluntary with-
drawals were significantly lower on the Me-
chanical and Scientific scales. In the pres-
ent study the withdrawal group also demon-
strated lower mean Mechanical and Scientific
scores; however, the difference for the Scien-
tific scale was not statistically significant. In
the earlier study the withdrawals obtained
higher scores on the Persuasive, Literary, and
Musical scales. In the present instance none
of these scales demonstrated statistically sig-
nificant differences between the W and S
groups. Thus, only one of the five differ-
ences reported by Rosenberg and Izard is
confirmed by the present data.




A second method of comparing the results 
from the present group with those of Rosen- 
berg and Izard was in terms of the VW score. 
The cadets who voluntarily withdrew from 
the training program did demonstrate a sig- 
nificantly higher VW score than the success- 
ful cadets. Thus, this score does have some 
validity for prediction of success in the train- 
ing program. However, the difference be- 
tween the S and W groups produces a biserial 
correlation of only .17 compared to the .56 
correlation found for the original standardi- 
zation group. 
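
The .56 figure for the standardization group can be approximated from the group means and SDs reported earlier (25.70 and 12.07 for the 137 successful cadets, 39.86 and 16.10 for the 137 withdrawing cadets). The sketch below uses the textbook biserial formula and makes two assumptions not stated in the source: the reported SDs are treated as population SDs, and the total SD is rebuilt from the two groups. The function name is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def biserial_r(mean_hi, sd_hi, n_hi, mean_lo, sd_lo, n_lo):
    """Biserial r from summary statistics of two criterion groups.

    r_b = (M_hi - M_lo) / SD_total * p * q / y, where y is the normal
    ordinate at the point dividing the distribution into p and q.
    """
    n = n_hi + n_lo
    p, q = n_hi / n, n_lo / n
    grand_mean = p * mean_hi + q * mean_lo
    # Total variance = within-group variance + between-group variance.
    var_total = (p * (sd_hi ** 2 + (mean_hi - grand_mean) ** 2)
                 + q * (sd_lo ** 2 + (mean_lo - grand_mean) ** 2))
    normal = NormalDist()
    y = normal.pdf(normal.inv_cdf(p))
    return (mean_hi - mean_lo) / sqrt(var_total) * p * q / y


if __name__ == "__main__":
    # Withdrawing cadets (higher VW scores) versus successful cadets.
    r = biserial_r(39.86, 16.10, 137, 25.70, 12.07, 137)
    print(f"biserial r for the standardization group: {r:.2f}")
```

Because the VW items were selected on these same 274 cadets, a value of this size would be expected to shrink in a new sample, which is consistent with the .17 found for the pretested groups.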

The differences in vocational interest scores 
between successful and withdrawing cadets 
are larger at the time of separation from the 
program than if measured at the beginning 
of training. Comparison of the S group with 
the successful cadets measured after nine 
months in the program indicates that the 
pretested group had significantly lower Me- 
chanical and higher Scientific, Literary, and 
Musical scores. For the withdrawal group 
the pretested cadets gave lower Persuasive 
and Clerical and higher Mechanical scores. 
Whether these differences are due to varia- 
tions in the set under which the question- 
naire was taken or are due to changes in the 
underlying interest themselves cannot be de- 
termined. However, since the interest pat- 
terns as measured by the KPR change, va- 
lidity at the time of withdrawal gives little 
indication of predictive validity.

An important consideration in the use of 
the KPR as a predictive measure is its rela- 
tionship to ability factors. The validity of 


this test is not limited to voluntary with- 
drawals. The F and M 


cally larger than those between the S and W 
groups but since the number of cases in these 


However, 
advantage can be taken of the general simi- 


significantly higher Literary scores than does 
the successful group. In addition, the VW 


scores for this pooled group were significantly 
higher than for the successful cadets. Thus, 
the VW scale constructed to detect voluntary 
withdrawals appears to be equally effective in 
detecting cadets who will fail for other rea-
sons such as lack of ability. This finding
suggests that the KPR interest scores are re- 
lated to the special abilities which are re- 
quired for success in naval aviation training. 

The data for the MCT given in Table 1
indicate that this ability measure correlates
significantly with seven of the 10 interest
scales. Of particular importance is the .30
correlation with the specially built VW scale.
Since the MCT is the most valid single meas-
ure of aptitude for flight training, the va-
lidity of the VW scale may be based on its
relationship to mechanical ability rather than
on the direct effect of the interest pattern it-
self. The results of the analysis of covari-
ance which appear in Table 2 demonstrate
that with the MCT held constant the differ-
ence between the mean VW scores for the
successful and total attrition groups was not
statistically significant. Thus, the validity of
the VW score appears to be based primarily
on its relationship to mechanical ability.
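
The logic of holding the MCT constant can be sketched as follows. This is a simplified illustration, not the authors' computation: it shows only the covariance-adjusted difference between two group means (using the pooled within-group regression slope), not the accompanying F test. The function name and the toy data are hypothetical.

```python
from statistics import mean


def adjusted_mean_difference(group_a, group_b):
    """Difference in criterion means after adjusting for a covariate.

    Each group is a list of (covariate, score) pairs, e.g. (MCT, VW).
    Returns (raw_difference, adjusted_difference).
    """
    def sums(group):
        xs = [x for x, _ in group]
        ys = [y for _, y in group]
        mx, my = mean(xs), mean(ys)
        sxy = sum((x - mx) * (y - my) for x, y in group)
        sxx = sum((x - mx) ** 2 for x in xs)
        return mx, my, sxy, sxx

    mx_a, my_a, sxy_a, sxx_a = sums(group_a)
    mx_b, my_b, sxy_b, sxx_b = sums(group_b)
    # Pooled within-group regression slope of score on covariate.
    slope = (sxy_a + sxy_b) / (sxx_a + sxx_b)
    raw = my_a - my_b
    adjusted = raw - slope * (mx_a - mx_b)
    return raw, adjusted


# Hypothetical data in which the score depends only on the covariate.
group_one = [(1, 2), (2, 4), (3, 6)]
group_two = [(3, 6), (4, 8), (5, 10)]
print(adjusted_mean_difference(group_one, group_two))
```

In this toy case the raw difference between the groups disappears once the covariate is taken into account, which is the pattern the analysis of covariance is designed to detect.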

The failure of the KPR mechanical scale
to predict success in U. S. Air Force flight
training when tests of mechanical ability
demonstrated high validity led Cerf (1947)
to suggest that the relationship between me-
chanical interests and mechanical ability is
low. For the naval cadets sampled in this
study the relationship appears to be greater
and the interest tests demonstrate some va-
lidity. Essentially, however, the present
study is in agreement with that of the Air
Force in suggesting that vocational interest
inventories of this type are not of great value
for predicting training outcome. Where in-
terest tests do demonstrate some validity,
ability measures can probably be found which
will cover the same variance and avoid the
problem of faking which often invalidates
these questionnaires.


Summary 


This paper reports a study of the validity 
of the Kuder Preference Record as a pre-
dictor of success in flight training. This in-
ventory was administered to 605 naval avia-
tion cadets on entrance into flight training.
Scores of the successful cadets were com- 
pared with cadets who withdrew or failed in 
the training program. The KPR demon-
strated small but statistically significant va-
lidity for prediction of all categories of at-
trition. However, when differences in me-
chanical ability were controlled, this inventory
did not show a significant relationship to the
pass-fail criterion. It was concluded, there-
fore, that the vocational interests measured
by this inventory do not have an important 
relationship to success in flight training ex- 
cept as they reflect the presence or absence 


of the special mechanical skills required in 
flying. 


Received May 9, 1958. 


References 


Cerf, A. Z. Personality inventories. In J. P. Guil-
ford, & J. I. Lacey (Eds.), Printed classification
tests. Rep. No. 5. Washington 25, D. C.: AAF, 
Aviation Psychol. Program, Res. Rep., U. S. Gov- 
ernment Printing Office, 1947. 

Kuder, G. F. Revised manual for the Kuder Pref-
erence Record. Chicago: Science Res. Ass., 1946.

Rosenberg, N., & Izard, C. E. Vocational interests 
of naval aviation cadets. J. appl. Psychol., 1954, 
38, 354-358. 


Journal of Applied Psychology 
Vol. 43, No. 1, 1959 


A REVISION OF THE STUDY OF VALUES FOR USE 
IN MAGAZINE READERSHIP RESEARCH 1


WARREN C. ENGSTROM AND MARY E. POWERS


Curtis Publishing Company 


One of the most important goals in maga- 
zine readership research is the development of 
valid and reliable methods of measuring indi- 
viduals with respect to whatever psychologi- 
cal attributes we might reasonably assume to 
be related to the reading of magazines. This 
goal is based on the assumption that the men- 
tal and emotional activity involved is not 
haphazard nor random, but has measurable 
causes within the individual personality. 

Studies which related demographic data to 
readership represented the first attempts by 
mass communications researchers to find func- 
tional variables of magazine readership. This 
approach was found to be of limited useful- 
ness as the analysis of readership data pro- 
gressed from simple studies of item popularity 
to attempts to understand unique reading pat- 
terns. The search for more basic explanations 
led to the attempt to measure personality fac- 
tors, which, if they could be found to be dis- 
criminators of individuals and groups of indi- 
viduals with respect to the items they choose 
to read, could lay the foundation for more 
adequate explanations of magazine reading 
behavior. 

The Allport-Vernon Study of Values was 
selected as the experimental measurement for 
this study for the following reasons: (a) Its 
orientation tended to be similar to attitude 
and opinion studies commonly made of gen- 
eral populations; (b) It was not a clinical 
measurement; its approach to personality 
measurement was in the area of normal per- 
sonality; (c) Evaluative judgments, as used 
in the Study of Values, appeared to be closely 
related to the type of behavior involved in 
magazine reading. 

Some historical justification for expecting 
that the Study of Values would be effective 
in relating magazine reading to values did 


1 The authors are indebted to Herbert C. Ludeke, 
Manager Development Division, Research Depart- 
ment, The Curtis Publishing Company, for his guid- 
ance and support throughout this project. 




exist. A. G. Woolbert showed that it was an 
effective predictor of recall of experimental 
newspaper items (Cantril & Allport, 1933; 
p. 265). Other early studies, reported by 
Cantril and Allport in 1933, demonstrated 
that general evaluative attitudes influence the 
activities of everyday life. 

The investigators were, of course, aware of 
the conflicting opinions about the Study of
Values. The hypothesis was made, however,
that this measurement would validly and re-
liably differentiate groups of individuals with
respect to the six values of the test, and that 
these values would be related to magazine 
reading. 


Revising the Study of Values 


In the form in which it is presently pub-
lished, the Study of Values was found to be
considerably above the vocabulary level of
noncollege general populations found in na-
tional studies of magazine reading. In addi-
tion, certain items appeared to be too spe-
cialized or of too limited interest to general
populations, as they assumed a cultural level
far above average. For the purposes of this
study, a revision was clearly necessary.

The Curtis revision of the Study of Values
attempted to reduce the vocabulary and cul-
tural levels to that of readers of mass circula-
tion magazines, while at the same time doing
as little violence as possible to the underlying
design and wording of the original test. The
investigators were at all times mindful of the
necessity of modernizing and simplifying the
language of the test without substantially
changing its design or thought. It was also
found necessary to simplify the original scor-
ing and marking system, and to develop a
new type of score sheet for office use. All
revisions were made empirically through the
combined experience of the investigators in
preparing, testing and analyzing survey ques-
tionnaires used with national samples of
magazine readers.




The following example will demonstrate the
level of the revision:

Item 20, Part I: Original Version (Allport,
Vernon, & Lindzey, 1951a)
Which of the following would you consider the
more important function of education? (a) its prepa-
ration for practical achievement and financial re-
ward; (b) its preparation for participation in com-
munity activities and aiding less fortunate persons.

Revised Version
[illegible] schools should be: (a) to prepare stu-
dents for good jobs; (b) to prepare students for
community activities and helping others.

The Curtis revision of the Study of Values
has been field tested in a variety of situa-
tions. Following extensive pretesting within
the Curtis Publishing Company, the revision
was administered as a house-to-house survey
with noncollege housewives in a
middle-class suburban Philadelphia neighbor-
hood in order to test the feasibility of the re-
vision in a typical field interviewing situation.
Results of this field test indicated that the re-
vision was usable under these conditions.

The most rigorous field test to date was
made when it was administered to 300 Ss in
a study to test the validity of the re-
vision. Five methods of administering the
Study of Values were used; the test was con-
ducted both with individuals and with groups,
in both office and home situations.

These field testing procedures demonstrated
that the Curtis revision of the Study of Values
is a practical instrument under field condi-
tions, in that all five methods were successful
in obtaining respondent cooperation and in es-
tablishing the understandability of the test
items, the scoring and the instructions. The
success of the personal interview method, in
particular, indicates that the revision can be
used on a national sample basis with maga-
zine reader audiences.


Establishing the Reliability of the Revision


Reliability tests were conducted on two
groups of Ss: the Junior Class of the Radnor
Senior High School of Wayne, Pennsylvania,
and a group of new industrial employees of
The Curtis Publishing Company. Both groups
met the educational and cultural specifications
of the study: no S had received more than
a high school education, although the high
school group contained many who planned to 
go on to college. The handicap imposed on 
the findings through the use of these groups 
was, of course, recognized at the time of their 
selection. The groups comprised individuals 
different in many ways from groups of mass 
circulation magazine readers found in national 
studies. If, however, the revision of the 
Study of Values proved to have acceptable 
reliability for the two experimental groups, 
it might be argued that the revision would 
be operating under less difficult conditions 
among adult readers of national magazines, 
and could be considered reliable for work 
with reader audiences. 

Both administrations to the high school 
group were made by the investigators them- 
selves, with a time lapse of four months be- 
tween the two tests. The time lapse for the 
industrial group was one month between the 
two tests, with the administration conducted 
by members of the Personnel Department of 
The Curtis Publishing Company. Table 1 
shows the test-retest correlations for the two
experimental groups.

The repeat reliability coefficients shown in
Table 1 were somewhat lower than those re-
ported by Allport, Vernon, and Lindzey for
their 1951 revision of the Study of Values.
A longer time period, however, was involved
for the high school group in this current
study than for the groups tested in the 1951
revision study. The mean repeat reliability
coefficients were .83 for the high school group

Table 1


Test-Retest Product-Moment Reliability Coefficients
for the Curtis Revision of the Study of Values


                  Test-Retest Reliability Coefficient

               Student Group a     Industrial Group b
Value             (N = 77)             (N = 58)

Theoretical         .81              [illegible]
Economic            .85                  .73
Aesthetic           .87              [illegible]
Social              .79                  .80
Political           .80                  .86
Religious           .85                  .81


a Time lapse between tests: four months.
b Time lapse between tests: one month.



Table 2 


Split-Half Reliability Coefficients for the Curtis 
Revision of the Study of Values 


                  Split-Half Reliability Coefficient

               Student Group       Industrial Group
Value             (N = 77)             (N = 58)

Theoretical         .85                  .57
Economic            .73                  .31
Aesthetic        [illegible]             .66
Social              .48                  .58
Political        [illegible]             .47
Religious           .86                  .63


Note.—Reliability coefficients of whole test, calculated by 
applying the Spearman-Brown prophecy formula to the corre- 
lation between split halves. 


and .78 for the Curtis industrial employees, 
using a z transformation. These mean co- 
efficients may be compared with the mean 
of .89 for the Allport, Vernon, and Lindzey 
(1951b) revision. 
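
The averaging through a z transformation mentioned above can be sketched as follows: each coefficient is converted to Fisher's z, the z values are averaged, and the mean is converted back to r. The student-group values below are the coefficients of Table 1 as read from the printed table (the reading of .79 for Social is an assumption where the print is unclear); the function name is illustrative.

```python
import math


def mean_correlation(coefficients):
    """Average correlations through Fisher's z transformation."""
    z_values = [math.atanh(r) for r in coefficients]
    return math.tanh(sum(z_values) / len(z_values))


# Student-group test-retest coefficients as read from Table 1.
student_retest = [0.81, 0.85, 0.87, 0.79, 0.80, 0.85]
print(round(mean_correlation(student_retest), 2))  # 0.83
```

The result agrees with the mean of .83 reported for the high school group. Averaging in z units rather than averaging the raw coefficients gives somewhat more weight to the higher correlations.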

The student group, as shown in Table 1, 
had higher test-retest reliability than the in- 
dustrial employees, despite the fact that the 
time interval between the two administra- 
tions of the revision was four times longer for 
the student group. These higher reliability 
coefficients are all the more noteworthy since 
nonadult, immature Ss would be expected to 
show less stability on the Allport-Vernon 
values than the adult, nonstudent group of 
Curtis employees. 

Split-half reliability was tested by divid- 
ing the revision into two subscales, the sub- 
scales being composed in such a manner that 
there was approximately the same number 
of pairings between the value under study and 
all remaining values. Table 2 shows the split- 
half reliability coefficients for the two experi- 
mental groups. 
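
The step from a correlation between two half-tests to a whole-test coefficient, as referred to in the note to Table 2, follows the Spearman-Brown prophecy formula. A minimal sketch is given below; the function names are illustrative, and the inverse function is included only to show what half-test correlation a given whole-test figure implies.

```python
def spearman_brown(r_half, factor=2.0):
    """Step a part-test correlation up to the reliability of a longer test.

    factor is the ratio of the new test length to the old (2 for split halves).
    """
    return factor * r_half / (1.0 + (factor - 1.0) * r_half)


def half_test_correlation(r_full):
    """Invert the split-half case: the correlation between halves
    that corresponds to a given whole-test reliability."""
    return r_full / (2.0 - r_full)


# A whole-test coefficient of .85 corresponds to a correlation
# of about .74 between the two halves.
print(round(half_test_correlation(0.85), 2))
```

The whole-test coefficients in Table 2 are therefore higher than the raw correlations between the two subscales from which they were derived.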

The mean reliability coefficient, using a z 
transformation, was .72 for the group of 
Radnor High School students and .57 for the 
group of industrial employees. This com- 
pares with a mean reliability coefficient of 
.82 for the Allport, Vernon, and Lindzey
(1951b) revision of the Study of Values. 
Split-half reliability coefficients were lower 
than the repeat reliability coefficients, as was 
also true for the Allport, Vernon, and Lindzey 


(1951b) revision. One possible reason for
the lower split-half reliability coefficients is
that the Study of Values does not contain
simple items dealing with only one value;
pairings of values take place in every item.
Random methods of selecting items for
equivalent forms of the test cannot be ap-
plied; the investigators had to apportion
the pairings of values for the items selected
for split-half reliability testing.

The investigators were satisfied that the
results of these tests of reliability demon-
strated that the Curtis revision was suffi-
ciently reliable to warrant further work with
the test, even though it might be argued that
the reliability figures would have been some-
what higher if reliability coefficients had been
established for groups of adults more similar
to those found in national populations of
magazine readers.

As far as the discriminative power of the
items in the revision was concerned, 96 out
of the 120 choices in the test successfully dis-
tinguished between Ss whose score indicated
[...] Analyzing them further, it was
found that only one item was totally worth-
less, in that neither choice was diagnostic. In
general, the 24 failures were accepted by the
investigators as a limitation of the Curtis re-
vision which did not materially impair its
value with respect to the purposes for which
it was designed.


Testing the Validity of the Revision 


The hypothesis that the Curtis revision of
the Study of Values has validity with respect
to the relationship between the six values as
measured by the revision and expressed inter-
est in reading magazine stories and articles
was tested by means of a study conducted with
300 Ss (150 men and 150 women) in six
Eastern cities: New York, Trenton, [illegible]-
town, Providence, Columbus, and Cincinnati.
The Curtis revision was administered to
each S, along with a questionnaire on reading
interests containing a list of titles of 35 maga-
zine nonfiction articles and 33 magazine
short stories. The titles were designed to
cover 10 nonfiction and 11 fiction topics, with
various themes and appeals within each topic. 
Interest in reading each of these items was 
registered by means of a thermometer scaling 
device, with high and low temperatures indi- 
cating high and low interest in reading the 
items. This thermometer scale had previ- 
ously been validated as a predictor of reading 
interest and other behavior. (At the same 
time, attitudes toward several national maga- 
zines were studied by means of semantic dif- 
ferential tests, the results of which are be- 
yond the scope of the present report.) In- 
terviewing for this study was conducted by 
the Alan C. Russell Marketing Research or-
ganization.

Two questions to be answered by this study
were, first, whether or not individuals scoring
high on a given value differed significantly
from those not scoring high on the value with
respect to interest shown in story and article
titles; and second, if there were differences,
how well did reading interests correspond to
the values as measured by the test? For the
purposes of this analysis, individuals were
considered “high” whose score for a value
was at least one standard deviation above the
mean for the value; individuals were consid-
ered to have shown positive interest in a title
if they rated it 80 degrees or higher on the
thermometer scale. Using these definitions,
differences in interest in each story or article
title for individuals high in each value were
tested for significance.
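
The two working definitions used in this analysis can be sketched as follows. The sketch is an illustration only: it reads "high" as a score at least one standard deviation above the group mean, it uses the population form of the standard deviation (an assumption, since the report does not say which form was used), and the names and data are hypothetical. The significance test applied to the resulting proportions is not specified in the report and is not reproduced here.

```python
from statistics import mean, pstdev


def high_scorers(value_scores):
    """Flag respondents scoring at least one SD above the group mean."""
    cutoff = mean(value_scores) + pstdev(value_scores)
    return [score >= cutoff for score in value_scores]


def interest_rates(value_scores, ratings, threshold=80):
    """Proportion rating a title at or above threshold, for the
    'high' respondents and for all remaining respondents."""
    flags = high_scorers(value_scores)
    high = [r >= threshold for r, f in zip(ratings, flags) if f]
    rest = [r >= threshold for r, f in zip(ratings, flags) if not f]

    def rate(hits):
        return sum(hits) / len(hits) if hits else float("nan")

    return rate(high), rate(rest)


# Hypothetical value scores and thermometer ratings for one title.
scores = [30, 40, 50, 60, 70, 72]
ratings = [40, 85, 60, 90, 95, 80]
print(interest_rates(scores, ratings))
```

Each title thus yields one pair of proportions per value, and it is the difference between the two proportions that is tested.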
Results indicated that 29 of the 33 fiction
titles and 32 of the 35 nonfiction titles were
significantly different in interest among per-
sons scoring high in the six values. Differ-
ences significant at the 5 per cent level of
confidence or better were then examined
against the characterization of the six value
types as described by Allport and Vernon in
the Manual of Directions for the Study of
Values, with the following results.

The dominant interest of the “theoretical”
person, according to the authors of the Study
of Values, is the discovery of truth; his inter-
ests are described as empirical, critical, and
rational; he is seen as an intellectualist, with
interest in science (1951b). In the current
study, those persons who scored high in the

theoretical value indicated a significantly 
higher interest in reading two of the three 
articles on science. For the third article, 
dealing with science in relation to human 
happiness rather than with “pure” science, 
they were not significantly higher. They also 
showed higher interest than others in science 
fiction. On the other hand they were signifi- 
cantly lower than other people in interest in 
reading the articles on domestic arts and on 
religion. They showed lower interest also in 
fiction themes dealing with romantic or senti- 
mental aspects of love, home, and children. 
In all, the high theoretical people in this 
study showed significantly different interest 
in reading 14 of the 33 stories and 15 of the 
35 articles. 

The “economic” person as described in the 
Study of Values Manual of Directions is in- 
terested in the utilitarian, the tangible, the 
practical, and “conforms well to the prevail- 
ing stereotype of the average American busi- 
ness man.” In the present study, those per- 
sons scoring high in the economic value indi- 
cated significantly higher interest than the 
noneconomic people in all sports articles, and 
in fiction dealing with sports and the West. 
They were significantly less interested than 
others in articles dealing with the theoretical 
aspects of a topic, as opposed to the practical, 
“how-to” aspects, whether the topic was medi- 
cine, religion, science, or entertainment. In 
all, the high economic people showed signifi- 
cantly different interest in reading three of 
the 33 stories and 12 of the 35 articles. 

The “aesthetic” person, as defined by the 
authors of the Study of Values, is one who 
has a dominant interest in beauty, harmony, 
symmetry; he need not necessarily be crea- 
tive, but is highly interested in the artistic. 
Those scoring high in the aesthetic value in 
the present test indicated significantly higher 
interest than nonaesthetic persons in the two 
articles dealing with aspects of American cul- 
ture. They showed significantly less interest 
in reading the magazine-type fiction items in- 
cluded in this study: of the 33 fiction titles, 
high aesthetic scorers were lower in interest 
in 23. They also showed less interest in read- 


ing the nonfiction items than did nonaesthetic 
people: of the 35 nonfiction titles, they indi- 
cated significantly lower interest in 17. 


The “social” person, as described by the 
test authors in the instruction manual, is one 
whose highest value is altruistic or philan- 
thropic love of people, and who therefore 
tends to be sympathetic and unselfish. In 
the present study, those scoring high in the 
social value showed significantly higher in- 
terest in articles whose theme indicated em- 
phasis on help or service to mankind. They 
also showed significantly higher interest in 
articles that centered on the home and family. 
In fiction their preferences ran to themes of 
human relationships: romance, home, and 
children, as well as stories of personal rela- 
tionships against medical or business settings. 
They were significantly less interested than 
other persons in sports, whether fiction or 
nonfiction. In all, high social scorers dif- 
fered significantly from others on 17 fiction 
and 12 nonfiction titles.

The “political” person, according to the 
test authors, is interested primarily in power, 
in all competition and struggle, not exclu- 
sively in politics. In the present study, those 
with high political scores expressed higher 
interest than others in reading articles on 
politics and on crime that dealt with con- 
flict. They were also significantly higher in 
interest in competitive sports in both fiction 
and nonfiction. In addition they expressed 
preferences for war fiction. They were sig- 
nificantly lower than nonpolitical people in 
interest in articles on religion and on do- 
mestic articles of the home-service variety. 
In all, they differed significantly from others 
on four fiction and 11 nonfiction titles.

The “religious” person, as described by 
Allport and Vernon (1951b), is mystical, and 
has as his highest value unity, or the relating
of himself to the cosmos as a whole. Those 
scoring high in the religious value in the pres- 
ent study showed significantly higher interest 
in all the religious articles included in the 
test. They were also higher than others in 
interest in articles dealing with some aspect 
of charity, and with family-service topics. 
In fiction they were significantly higher in in- 
terest for those stories of human relationships 
that seemed to be concerned with human 
problems. In all, high religious scorers dif- 
fered significantly from others on five fiction 
and 14 nonfiction titles. 




In summary, the data showed many areas
where there were plausible relationships be-
tween the values as defined by the Allport-
Vernon Study of Values and interest in read-
ing the test items. The value scores were re-
markably effective in discriminating among
the types of stories and articles of interest to
the various value groups. It should be noted
that this study of validity was set up as a
pilot study prior to a proposed full-scale na-
tional sample survey which would be the
necessary and final step in establishing the
validity of the test with respect to the rela-
tionship between the values tested and inter-
est in reading magazines. From the evidence
already on hand, however, it seems reason-
able to assume that the Curtis revision of the
Study of Values may be of considerable value
to readership researchers in their attempts to
understand reading behavior as it pertains to
magazines.

Summary 


One goal of magazine readership research
is to develop measurements of psychological
attributes related to readership of magazines.
The Allport-Vernon Study of Values was
chosen for study, and was revised for use
with national samples of noncollege, general
populations, under the conditions of house-to-
house field survey interviewing. The revision
was found to fulfill at least the minimum re-
quirements of reliability. A pilot study made
on 300 Ss tested the hypothesis that the
Study of Values has validity with respect to
interest in reading magazine fiction and non-
fiction. Significant differences were found in
reading interest for individuals of different
values, and the value scores were effective in
discriminating among the types of material
chosen for reading by different groups of in-
dividuals.

Received July 31, 1958. 
Early Publication. 


References


Allport, G. W., Vernon, P. E., & Lindzey, G. Study
of values. (Rev. ed.) Boston: Houghton Mifflin,
1951. (a)

Allport, G. W., Vernon, P. E., & Lindzey, G. Study
of values manual of directions. (Rev. ed.) Bos-
ton: Houghton Mifflin, 1951. (b)

Cantril, H., & Allport, G. W. Recent applications
of the study of values. J. abnorm. soc. Psychol.,
1933, 28, 259-273.


Journal of Applied Psychology 


Vol. 43, No. 2
April, 1959


INTERACTIONS AMONG OPERATOR VARIABLES,
SYSTEM DYNAMICS, AND TASK-INDUCED STRESS 1


W. D. GARVEY AND F. V. TAYLOR


Naval Research Laboratory 


The present series of experiments compares
the performance of several man-machine
tracking systems when Ss are controlling
under “normal unstressed” conditions and
when controlling under conditions of “task-
induced stress” (Lazarus, Deese, & Osler,
1952). The experiments seek to determine
if the relative efficiencies of two or more
systems are the same when they are operated
under “normal” conditions as when they are
operated under conditions of task-induced
stress. A brief description of the three experi-
ments follows:


Experiment I 


An acceleration control and an acceleration-
aided control system are compared in the
absence of stress and under a variety of stress
conditions.


Experiment II

The same two systems as those used in Ex-
periment I are used; however, Ss are selected
and divided into two groups so that the
poorer Ss control the acceleration-aided sys-
tem and the better Ss control the acceleration
control system. The purpose is to determine
if stressing the human element will produce
differential deterioration in the performances
of two systems which have been equated
through selection of Ss.

Experiment III

Since prolonged training of Ss on the two
systems used in Experiments I and II indi-
cated that the performance of these would
probably never become equated through train-
ing, this experiment employs the acceleration
control system used in the previous experi-
ments and a position control system. The
experiment seeks to determine the differential
effects of stress on the performance of two
systems which have been equated through
extensive training of Ss.


1 The authors wish to thank Jean B. Henson for
assistance in running Ss and analyzing the data.


Method 


Apparatus and procedure. The experiments were
conducted with the three man-machine systems
shown in Fig. 1. In each system, S serves as a
tracking element and it is his task to keep a dot
of a CRT centered on a stationary hairline by
manipulating a joy stick control. Tracking is in
only one dimension since the dot and joy stick are
free to move only in the horizontal plane. The dot
is forced off the hairline by a complex sine wave
input (system input) which consists of two basic
frequencies of 2.3 and 3.2 cycles per min.; the rela-
tive amplitude of each sine wave is inversely pro-
portional to its frequency.

In the Acceleration Control System in Fig. 1, a
1° displacement of the stick produces an 18 mm./sec.²
acceleration of the dot on the scope. The
Acceleration-Aided Control System is equivalent to
the Acceleration Control System with the exception
that the respo[...] the system, have [...]
of the error to S. A stick movement of 1° imparts
a displacement of 4.5 mm., a velocity of [...] mm./
sec. and an acceleration of 18.0 mm./sec.² to the dot
on the scope. In the Position Control System, a 1°
stick displacement produces a 4.5 mm. change in the
position of the dot on the scope. The performance
of each system was measured in terms of absolute
system error, integrated over the last 50 sec. of a
60-sec. trial.
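
The quantities described in this section can be made concrete with a small numerical sketch. It is an illustration under stated assumptions, not a model of the apparatus: the amplitude scale of the input is arbitrary (only the relative amplitudes are given), the gains of 4.5 mm. per degree and 18 mm./sec.² per degree are taken from the text as read, the acceleration system is integrated by a simple Euler step, and all function names are hypothetical. The acceleration-aided system is omitted because its velocity term is not legible in the source.

```python
import math

F1_CPM, F2_CPM = 2.3, 3.2  # input frequencies, cycles per minute


def course_input(t_sec, scale=1.0):
    """Two-component sine input; each amplitude is inversely
    proportional to its frequency (scale is an arbitrary constant)."""
    total = 0.0
    for f_cpm in (F1_CPM, F2_CPM):
        amplitude = scale / f_cpm
        total += amplitude * math.sin(2.0 * math.pi * (f_cpm / 60.0) * t_sec)
    return total


def dot_displacement(stick_deg, duration_sec, system, dt=0.001):
    """Dot displacement (mm) after holding a constant stick deflection."""
    if system == "position":
        return 4.5 * stick_deg  # 4.5 mm per degree, immediately
    if system == "acceleration":
        velocity = position = 0.0
        for _ in range(round(duration_sec / dt)):
            velocity += 18.0 * stick_deg * dt  # 18 mm/sec^2 per degree
            position += velocity * dt
        return position
    raise ValueError("system must be 'position' or 'acceleration'")


def integrated_abs_error(errors, dt, skip_sec=10.0):
    """Absolute error integrated over a trial, ignoring the first skip_sec."""
    start = round(skip_sec / dt)
    return sum(abs(e) for e in errors[start:]) * dt


# A 1-degree deflection held for 2 sec.: fixed offset versus growing offset.
print(dot_displacement(1.0, 2.0, "position"))
print(round(dot_displacement(1.0, 2.0, "acceleration"), 1))
```

With position control a held deflection produces a fixed offset of the dot, whereas with acceleration control the same deflection produces an offset that grows with the square of the time it is held, so S must reverse the stick to stop the dot.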
Forty-eight naval enlisted men served as Ss in the
three experiments, 16 Ss per experiment. Before



[Figure 1: block diagrams of the Acceleration Control System, the
Acceleration-Aided Control System, and the Position Control System.
Each shows input, man, control and mechanism, output, and error
(without regard to direction).]

Fig. 1. Simplified block diagrams of three manual tracking systems employed.
The symbol ⊕ stands for algebraic addition, ∫ stands for integrator, and
X stands for amplifier.


each experiment all Ss who were to be used were
given 20 1-min. trials on each of the two systems
under study. The error scores obtained during these
trials were used to select Ss for the experimental
groups. The group assignments were as follows:
In Experiment I, 16 Ss were divided into two groups
of eight, matched in terms of means and variances
of the performance scores of both systems during
the pre-experimental trials. For Experiment II, 16
new Ss were divided into two groups of eight, using
performance data from the pre-experimental trials
as criteria. The eight best Ss were assigned to
operate the acceleration control system and the eight
poorest Ss were assigned to operate the acceleration-
aided system. In Experiment III, the remaining 16
Ss were divided into two matched groups as in
Experiment I.

Once assigned to a specific group, Ss operated
only one system throughout the remainder of the
experiment. The Ss in all experiments received equal
training, which consisted of 23 sessions of 10
trials. Two sessions per day were given, one in the
morning and one in the afternoon. The Ss were
told their scores and encouraged to try to improve
their performance with each trial.

At the end of the 23rd training session in all
experiments Ss were required to operate the systems
under a series of conditions of task-induced stress.
Before each stress condition Ss were given [...] 60-
sec. trials without stress, followed by an explanation
of how the next session would differ from the train-
ing sessions. With the exception of the condition of
prolonged trials, Ss were given 10 60-sec. trials under
each of the stress conditions. Each variety of stress
was presented on a different day in the following
order: 2

Prolonged trials: The Ss were required to oper-
ate the system continuously for 60 min. Error
scores were recorded at the end of every 5-min.
period during the hour.

Reversed display-control relationship: The
tracking task was performed with a display-control
relationship which was the reverse of that on which
Ss were trained.

Left-hand tracking: Under this condition, Ss
were required to track with their left hands. (During
the pre-stress training they tracked with their right
hands.)

Two-hand tracking: Two targets were tracked
simultaneously, one with a right-hand control and
one with a left-hand control. The left-hand system
was identical with the right-hand system; the two
dots followed the same course input, which was that
used during the training sessions. Only per-
formance with the right-hand control is used in
analyzing the results.

Two-coordinate tracking: The dot and stick
were free to move in two coordinates and Ss were
required to track the target in both. The dot was
controlled with the right stick only. The course
input to the system was that used during training,
except that its path of movement was rotated 45°
from the horizontal axis. Only the performance of
the system in the horizontal coordinate is used in
analyzing the results.

Secondary visual task: The tracking task during
this condition was the same as that employed during
the training sessions. In addition to this, Ss were
required to perform a second task which consisted
in detecting and reporting the range and bearing of
targets on a simulated radar scope. A measure was
obtained of the accuracy of Ss’ reports of targets
as well as their tracking scores.

Secondary arithmetic task: While performing
the tracking task, Ss were required to solve two-
digit subtractions at a very rapid rate. In order to
obtain some measure of Ss’ arithmetic ability they
were given the same arithmetic problems without
tracking on the following day.

Results 


Performance is expressed in terms of median integrated error scores. This is necessitated by the fact that under several conditions some Ss lost the target before the end of the full 60-sec. trial. On such occasions, maximum scores of 100 were recorded. When a median score of 100 is presented, this indicates that the targets were lost on 50% or more of the trials under that condition.

² For further description of the stress conditions, see Garvey (1957, pp. 3-4).

[Figure 2: three stacked graphs. Exp. I: acceleration vs. acceleration-aided; Exp. II: acceleration vs. acceleration-aided; Exp. III: acceleration vs. position. Horizontal axis: training sessions.]

Fig. 2. System performance as a function of training. The top graph shows the results from Experiment I; the middle graph, from Experiment II; and the bottom graph, from Experiment III.
Effects of training. The results of training in Experiment I are shown in the top graph of Fig. 2. The performance of the acceleration-aided control system is substantially better than that of the acceleration control system. Using the mean of all trials within a session for each S as a datum, the difference in system performance for each session was tested with Wilcoxon’s (1949) test for unmatched replicas and was found to be significant throughout the training sessions (p < 0.01).

The middle graph in Fig. 2 shows the results obtained in Experiment II. The differences between the two systems, when tested with Wilcoxon’s test, were not significant at any point in the training period (p > 0.25).

The results of Experiment III are shown in the bottom graph in Fig. 2. During the first seven sessions, performance of the acceleration control system was significantly poorer than that of the position control system (p < 0.05) when tested with Wilcoxon’s test. With the exception of Sessions 13 and 15, the differences between the two systems were not reliable throughout the remainder of the training sessions (p > 0.05).

Effect of stress. The graphs in Fig. 3 show the results of the 60-min. continuous-tracking trials. The measure of performance is expressed in terms of median integrated error per minute for each 5-min. tracking period of the 60-min. trial.

[Figure 3: three graphs. Exp. I: acceleration vs. acceleration-aided; Exp. II: acceleration vs. acceleration-aided; Exp. III: acceleration vs. position. Vertical axis: median integrated error per minute (arbitrary units). Horizontal axis: period of continuous tracking (min.), 0 to 60.]

Fig. 3. System performance as a function of prolonged periods of continuous tracking. The left-hand graph shows the results from Experiment I; the middle graph, from Experiment II; and the right-hand graph, from Experiment III.

Using Wilcoxon’s (1949) test for comparison of several treatments, it was found in Experiment I that performance deteriorated significantly for both systems with the increased duration of the trial (p < 0.05).


Table 1

System Performance Deterioration Under Stress
(Median Integrated Error)

Experiment I
Condition                        System(a)  Amount   P            Difference   P
Incompatible Display-Control     A          91.5     .01          47.6         .01
                                 A-A        43.9     [illegible]
Left-Hand                        A          17.5     .01          6.9          .05
                                 A-A        10.6     .05
Two-Hand                         A          91.5     .01          53.8         .01
                                 A-A        37.7     .01
Two-Coordinate                   A          91.5     .01          67.1         .01
                                 A-A        24.4     .01
Secondary Visual Task            A          91.5     .01          57.3         .01
                                 A-A        34.2     .01
Secondary Arith. Task            A          3.5      .05          0.1          .25
                                 A-A        3.4      .05

Experiment II
Condition                        System(a)  Amount        P            Difference    P
Incompatible Display-Control     A          69.3          .01          3.4           .25
                                 A-A        65.9          .01
Left-Hand                        A          10.2          .01          6.5           .05
                                 A-A        3.7           .05
Two-Hand                         A          78.7          .01          19.6          .01
                                 A-A        59.1          .01
Two-Coordinate                   A          [illegible]   .01          [illegible]   .01
                                 A-A        67.6          .01
Secondary Visual Task            A          62.2          .01          14.4          .05
                                 A-A        47.8          .01
Secondary Arith. Task            A          [illegible]   [illegible]  [illegible]   [illegible]
                                 A-A        0.8           .10

Experiment III
Condition                        System(a)  Amount   P      Difference   P
Incompatible Display-Control     A          83.9     .01    73.6         [illegible]
                                 P          10.3     .01
Left-Hand                        A          8.8      .05    1.5          [illegible]
                                 P          7.3      .05
Two-Hand                         A          92.7     .01    79.5         [illegible]
                                 P          13.2     .01
Two-Coordinate                   A          92.7     .01    72.5         [illegible]
                                 P          20.2     .01
Secondary Visual Task            A          84.9     .01    58.7         [illegible]
                                 P          26.2     .01
Secondary Arith. Task            A          2.8      .05    2.6          [illegible]
                                 P          0.2      .25

(a) System A = Acceleration Control System, System A-A = Acceleration-Aided Control System, System P = Position Control System.
"a 


The interaction between systems and duration of trial, when tested with Wilcoxon’s (1949) test of interactions, was not significant (p > 0.05). This is taken to indicate that the differential deterioration shown in Fig. 3 is not significant.

In Experiment II, Wilcoxon’s (1949) test indicated that the performance of both systems deteriorated significantly (p < 0.05) with continuous running. With the exception of the 30- and 35-min. periods, performance of the acceleration control system was reliably poorer (p < 0.05) than that of the acceleration-aided system after the first 10 min. of tracking.

Performance of both systems deteriorated significantly in Experiment III with duration of trial (p < 0.05). Except for the 5-, 15-, 55-, and 60-min. periods, performance of the acceleration control system was significantly poorer than that of the position control system (p < 0.05).


[Figure 4: three bar graphs. Exp. I: acceleration vs. acceleration-aided; Exp. II: acceleration vs. acceleration-aided; Exp. III: acceleration vs. position. Horizontal axis: conditions.]

Fig. 4. System performance unstressed and under several conditions of task-induced stress. The stress conditions are, from left to right: Incompatible Display-Control Relationship, Left-hand Tracking, Two-hand Tracking, Two-coordinate Tracking, Secondary Visual Task, and Secondary Arithmetic Task.


The results of the other stress conditions 
are summarized in Table 1 and are presented 
graphically in Fig. 4. The “unstressed” con- 
dition in Fig. 4 represents an average of the 
medians of system performance on the last 
five training sessions. The columns labeled 
Amount in Table 1 represent the median dif- 
ference between system performance under 
a stress condition and the unstressed condi- 
tion. The columns labeled Difference repre- 
sent the difference in the amount of deterio- 
ration between the performances of the two 
systems employed in each experiment. Wil- 
coxon’s (1949) matched replicas test was used 
to obtain the p values for the amount of de- 
terioration and his unmatched replicas test 
was used to obtain the p values for difference 
in deterioration. 
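The relation between the Amount and Difference columns can be illustrated with a short sketch. It assumes the Experiment I Amount values as read from Table 1 (decimal points restored from the scan), and the names `AMOUNT_EXP_I` and `difference_in_deterioration` are hypothetical, introduced only for this illustration. It reproduces the arithmetic of the Difference column, not the significance tests.

```python
# Amount of deterioration (median integrated error) in Experiment I,
# as read from Table 1. Assumption: decimal points restored from the scan.
# Each entry maps a stress condition to (System A amount, System A-A amount).
AMOUNT_EXP_I = {
    "incompatible display-control": (91.5, 43.9),
    "left-hand": (17.5, 10.6),
    "two-hand": (91.5, 37.7),
    "two-coordinate": (91.5, 24.4),
    "secondary visual task": (91.5, 34.2),
    "secondary arithmetic task": (3.5, 3.4),
}


def difference_in_deterioration(amounts):
    """Return, per condition, how much more System A deteriorated than A-A."""
    return {
        condition: round(system_a - system_aa, 1)
        for condition, (system_a, system_aa) in amounts.items()
    }


DIFFERENCES = difference_in_deterioration(AMOUNT_EXP_I)

if __name__ == "__main__":
    for condition, diff in DIFFERENCES.items():
        print(f"{condition:30s} {diff:5.1f}")
```

A positive value means the acceleration control system lost more performance under that stress condition than the acceleration-aided system did.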

The results shown in Table 1 and Figs. 3, 
4 may be summarized as follows: 

Experiment I. The performance of the 
acceleration-aided control system was consid- 
erably better than that of the acceleration 
control system throughout the entire training 
period. When these systems were operated 
under conditions which stressed the human 
operator this difference between system per- 
formance was accentuated. 

Experiment II. Under a majority of the 
stress conditions, performance of the accelera- 
tion control system was significantly poorer 
than that of the acceleration-aided control 
system. 

Experiment III. Under conditions which 
stressed the operator, performance for both 
systems deteriorated, but the deterioration 
was greater for the acceleration control 
system. 
Brief mention should be made of performance on the secondary tasks. In none of the experiments were reliable differences in target-detection performance found between groups when tested with Wilcoxon’s (1949) test for unpaired replicas (p > 0.05). Likewise, no reliable differences (p > 0.05) were found in the arithmetic performances of the groups in Experiments I and III (either when tracking or when not). However, in Experiment II, Ss in the acceleration-control group did better (p < 0.05) than Ss in the acceleration-aided control group (both when tracking and when not). These results are taken as evidence that the differential deterioration of system performance under these conditions was not due to differential effort on the secondary task at the expense of the primary task.


Discussion 


Many previous studies have indicated that 
the nature of the dynamics is an important 
variable in determining tracking system per- 
formance under laboratory conditions. Aided 
tracking is generally found to be superior to 
unaided (Birmingham & Taylor, 1954; Cher- 
nikoff, Birmingham, & Taylor, 1955; Fox- 
boro Co., 1945) and position control has been 
shown to be better than pure velocity control 
with medium and high frequency inputs 
(Chernikoff & Taylor, 1957; Lincoln & Smith, 
1952). One of the purposes of the present 
study was to determine whether or not the 
order of merit of tracking systems, estab- 
lished in the absence of stress through 
manipulation of system dynamics, would be 
altered by task-induced stress. The findings 
are clear in indicating that such is not the 
case for the systems studied. The better 
systems retained their advantage under stress. 

This result held up even when the dynamic variable was counterbalanced by selection (Experiment II) and by training (Experiment III) in such a fashion as to equate entirely measured system performance before stress was applied. Thus, within the confines of the present studies the “engineering” variable of system dynamics proved to be ascendant over the “psychological” variables of selection and training in determining relative performance under stress.

It is recognized that the generalization of 
these findings to other dynamic variables, 
different types of systems, other varieties of 
stress and forms and degrees of selection and 
training not employed here would be very 
hazardous. It is entirely conceivable that 
system dynamics could be adjusted in such 
a way as to cause the better of two man- 
machine systems (operated in the absence of 
stress) to become the less proficient under 
stress. Likewise, there are undoubtedly circumstances in which the level of ability of the human operator or the extent of his training will have more to do with man-machine system performance under stress than will machine variables. Certainly, when system dynamics are held constant, the proficiency level of the human operator becomes an extremely important predictor of stressed performance.

What is clearly needed is a far better understanding of (a) the information processing requirements of the operator in different systems, (b) how men differ in their ability to meet these requirements, (c) how training enhances human information processing, (d) how different types of stress act to degrade this aspect of human performance, and finally, (e) how different system configurations reflect changes in the operator’s information handling capacity. Until these fundamental questions are answered, our ability to predict the way in which different variables will affect the performance of man-machine systems will remain most limited and very much a matter of ad hoc research.

Nevertheless, although the present findings are of restricted generality, they do have practical significance. First, they show that the performance of some systems is disrupted more by task-induced stress than others. Thus, there is no single “degradation factor” by which unstressed performance can be multiplied to predict performance “in the field.” Second, as a corollary of the above, man-machine systems which may be found to differ only slightly in a laboratory evaluation may differ very considerably under the exigencies of stressful field operations. Third, the variable of system dynamics may be of paramount importance in determining the “stress-resistance” of a man-machine system. Finally, the use of operator selection and training as a substitute for proper system design, although it may work in the laboratory, is a highly questionable procedure if the human components of the system will have to work under stress.


Summary

Three experiments were conducted to determine the effect of stressing the human element in a man-machine system on the performance of the systems.

In Experiment I, matched Ss were trained to operate an acceleration control system and an acceleration-aided control system. Performance of the acceleration-aided control system was found to be superior to that of the acceleration control system during training. When stressed, performance of both systems deteriorated and the difference between the performances of the two systems was accentuated.

In Experiment II, the difference in the unstressed performance of the two systems used in Experiment I was eliminated by selecting poor trackers to operate the acceleration-aided system and good trackers to operate the acceleration control system. Even though performance of the two systems was equated by this selection procedure, the acceleration control system deteriorated to a greater extent under the majority of the stress conditions.

In Experiment III, matched Ss were trained (given equal amounts of practice) to operate a position control system and an acceleration control system. At the beginning of training, performance on the two systems was found to differ; however, this difference was gradually eliminated through training. When these trained Ss were stressed, performance of both systems deteriorated and that of the acceleration control system deteriorated to a greater extent.

The results of these experiments are discussed relative to training, selection, and the design of man-machine systems.


Received July 22, 1958. 


References 


Birmingham, H. P., & Taylor, F. V. A human engineering approach to the design of man-operated continuous control systems. U. S. Naval Res. Lab. Rep., 1954, No. 4333.

Chernikoff, R., Birmingham, H. P., & Taylor, F. V. A comparison of pursuit and compensatory tracking under conditions of aiding and no-aiding. J. exp. Psychol., 1955, 49, 55-59.

Chernikoff, R., & Taylor, F. V. Effects of course frequency and aided time constant on pursuit and compensatory tracking. J. exp. Psychol., 1957, 53, 285-292.

Foxboro Company. Studies in aided tracking. Memorandum No. 25 to Div. 7, N.D.R.C., Foxboro Co. Foxboro, Mass.: Author, 1945, 1-43.

Garvey, W. D. The effects of task-induced stress on man-machine system performance. U. S. Naval Res. Lab. Rep., 1957, No. 5015.

Lazarus, R. S., Deese, J., & Osler, S. F. The effects of psychological stress on performance. Psychol. Bull., 1952, 49, 293-317.

Lincoln, R. S., & Smith, K. U. Systematic analysis of factors determining accuracy in visual tracking. Science, 1952, 116, 183-187.

Wilcoxon, F. Some rapid approximate statistical procedures. Stamford, Conn.: American Cyanamid Co., 1949.


Journal of Applied Psychology
Vol. 43, No. 2, 1959 


THE FRAMES OF REFERENCE OF FLYING 
INSTRUCTORS 


RICHARD WANT 


Department of Air, Australia 


Krumboltz and Christal (1957) have shown 
that in assessing the capacities of their stu- 
dents flying instructors employ a relative 
rather than an absolute frame of reference. 
They point out that, as each instructor usu- 
ally has only four students, variations in 
standards that occur as a result of shifting 
frames of reference can have serious con- 
sequences: 


One student grouped with highly talented fellow 
students might fail while another student of equal 
or even less ability might pass because he had hap- 
pened to be placed with students of low ability. 
If such a condition prevails, the Air Force is not 
getting the best possible pilots, deserving men are 
failing, and the true validity of the pilot stanine is 


not being estimated accurately (Krumboltz & Christal, 
1957, p. 409). 


A copy of the paragraph containing the 
above quotation was circulated to a number 
of flying instructors in the Royal Australian 
Air Force. The almost unanimous response 
was, “That sort of thing would not happen 
here.” A senior officer summed up the gen- 
eral attitude by saying, “Flying instructors 
don’t scrub their pupils because they are not 
as good as their mates: they scrub them because they aren’t able to fly.”

The aim of this paper is to present some 
figures to indicate that the frames of refer- 
ence of flying instructors in this country are 
influenced by variations in the quality of 
members entering training, despite their opin- 
ion to the contrary. These figures were ob- 
tained fortuitously, as a result of the intro- 


duction of a new pilot selection system in 
the Air Force. 


Method 


In Australia, Naval pilot trainees are trained in 
Air Force units. They wear Naval uniforms but 
live and do their training side by side with Air 
Force trainees. An effort is made to treat the 
members of the two services on a basis of scrupu- 
lous equality. If one ignores the fact that they are, 
as a rule, in a minority, there does not appear to 
be any significant feature in which the situation of the Naval trainees differs from that of those in the Air Force.

The trainees of the two services are selected separately. Until May, 1955, the methods of selection used by the two services, though not identical, were similar. Candidates completed a battery of intelligence tests and were interviewed by a selection board. All Naval trainees were commissioned on the completion of their training. At the time these figures were recorded, the Air Force trainees graduated as sergeants, and were considered for commissions at a later date. The Naval system of selection therefore placed slightly more emphasis on the candidate’s general level of education and his “officer qualities.”

In May, 1955, the Air Force introduced a new selection system incorporating a “pilot stanine” which was based on tests which had been validated in the United States (Flanagan, 1948) and England. The Naval selection system remained unaltered.

Success and failure rates are reported for Air Force and Naval students entering training in two periods. The first, from January, 195[illegible], to April, 1955 (before the introduction of the new system in the Air Force) is referred to as “before”; the second, from May, 1955, to April, 1957 (after the introduction of the new system in the Air Force) is referred to as “after.”

The students considered as successful completed all aspects of flying training, including ground training, and graduated as pilots. Those considered as failures were removed from training on account of inability to learn to fly. Those removed from training on medical grounds, for academic failure, or at their own request, were excluded from the study.

Pilot stanine scores for the students entering the Air Force in the period “after” were correlated with the pass-fail criterion. Psychological test results of 81 students who were tested under the old system, and who entered the service during 195[illegible], were also correlated with this criterion.

Results

The figures for the number of students passing and failing in the period “before” are given in Table 1, and those for the period “after” in Table 2.

Table 1
Pass-Fail Rate in Period “Before”

Service      No. Passing   No. Failing   Total   Percentage Failing
Air Force    108           55            163     33.7%
Navy         45            29            74      39.2%

Table 2
Pass-Fail Rate in Period “After”

Service      No. Passing   No. Failing   Total   Percentage Failing
Air Force    62            29            91      31.9%
Navy         16            31            47      66.0%

The stanine scores of the 91 Air Force members in the sample for the period “after” ranged from 3 to 9. A few members with scores of 1 and 2 were eliminated before the actual commencement of flying training. The biserial correlation between those scores and the pass-fail criterion was .40 (this result is significant at the .001 level). Corrected for attenuation this correlation rises to .52.

The test results of the 81 members tested under the old system yielded a correlation of -.09, which was not statistically significant.

A study of Tables 1 and 2 reveals that, though there was a decrease in the failure rate in the Air Force after the introduction of the new system, this decrease was small and not statistically significant. The failure rate in the Air Force was lower than that in the Navy during both periods. During the period “before” the difference was not statistically significant. During the period “after” it was significant at the .001 level (it may be noted that the Naval failure rate was more than double that of the Air Force).

In order to determine whether the change in wastage patterns thus noted had been gradual, or whether it had occurred steeply, the results for the period “before” were subdivided into three phases, and those for the period “after” into two phases, in accordance with the dates at which members entered training. The resulting figures appear in Table 3.
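The between-service comparisons reported for Tables 1 and 2 can be checked with a short sketch. The paper does not say which significance test was used; the sketch assumes a Pearson chi-square test on each 2 x 2 table without continuity correction, and the function name `chi_square_2x2` is hypothetical. The critical values are the standard ones for one degree of freedom.

```python
def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the 2 x 2 table
    [[a, b], [c, d]], where rows are services and columns are pass / fail."""
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))


# Counts taken from Tables 1 and 2: (Air Force pass, fail, Navy pass, fail).
CHI2_BEFORE = chi_square_2x2(108, 55, 45, 29)
CHI2_AFTER = chi_square_2x2(62, 29, 16, 31)

# Standard chi-square critical values for 1 degree of freedom.
CRITICAL_05 = 3.841
CRITICAL_001 = 10.828

if __name__ == "__main__":
    print(f"before: chi-square = {CHI2_BEFORE:.2f}")
    print(f"after:  chi-square = {CHI2_AFTER:.2f}")
```

Under this assumption the "before" statistic falls below the .05 critical value and the "after" statistic exceeds the .001 critical value, which agrees with the pattern of significance described in the text.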


Discussion 


Table 3 indicates that the failure rate in 
the Navy did, in effect, rise steeply, and pre- 
cisely at the same time as the Air Force 
members selected by the new method entered 
training. 

With increasing industrialization in this country, there has been a gradual decline in the numbers of young men offering for technical training in the services. There is no reason to believe, however, that any sudden change in this regard occurred during 1955, nor is there any reason to believe that anything occurred at this time that would have affected the quality of the candidates offering for the Navy without at the same time affecting the quality of those offering for the Air Force. It would seem reasonable to infer, therefore, that the sudden change in the wastage patterns in the two services was related to the introduction of the new Air Force selection system. The existence of a significant correlation between the new system and the pass-fail criterion, and the lack of correlation in the case of the old system, would suggest that the new system did, in effect, make a contribution to the efficiency of selection. An increase in the failure rate in the students not selected by the new method is, under the circumstances, just what one would have predicted on the basis of the principle stated by Krumboltz and Christal.

Table 3
Pass-Fail Rate in Periods “Before” and “After” Broken Up into Phases

                               Air Force                    Navy
Date of Entry                  Total Number   Percentage    Total Number   Percentage
                                              Failing                      Failing
Period “Before”
  [illegible] to Aug. ’53      53             30.1%         [illegible]    31.8%
  Sept. ’53 to Aug. ’54        46             [illegible]   [illegible]    [illegible]
  Sept. ’54 to April ’55       64             34.4%         [illegible]    [illegible]
Period “After”
  May ’55 to May ’56           42             26.7%         [illegible]    [illegible]
  June ’56 to April ’57        49             36.7%         25             68.0%

It is to be noted that there were approxi- 
mately twice as many Air Force members as 
Naval members under training at any one 
time. The frames of reference of the instruc- 
tors would, therefore, be influenced to a ma- 
terial extent by any change in the general 
standard of the former. It is also to be noted 
that an improved selection system would tend 
to decrease appreciably the number of train- 
ees in the lower range, thus rendering con- 
spicuous to their instructors those who hap- 
pened to be in this range and exposing them 
to greatly increased risk of failure.

If the new system has made a contribution 
to the efficiency of selection, as claimed, why 
has there not been a significant reduction in 
the failure rate? 

Since 1955, the Air Force has been experiencing the results of technical changes. Aircraft have become more costly. Reciprocating engines have been yielding to jets, and the view has been put forward that all students who graduate should be capable of converting to jets. It would not be surprising if pressures arising out of these factors had caused flying instructors to raise their standards. It is, nevertheless, a point of considerable interest that, if they have raised their standards, they do not seem to be aware that they have done so.


Summary 


This paper examines the failure rates in Air Force and Naval trainees trained side by side. The method of selection of Air Force trainees was altered at a given point of time, but the method of selection of the Naval trainees remained unaltered. Although no significant change was noted in the failure rate in the Air Force trainees, the failure rate in Naval trainees rose steeply. It was argued that this change in the failure rate of the Naval trainees could be explained in terms of a change in the frames of reference of flying instructors.


Received April 14, 1958. 


References 


Flanagan, J. C. The aviation psychology program in the Army Air Forces. AAF Aviation Psychol. Program Res. Rep. No. 1. Washington: U. S. Government Printing Office, 1948.

Krumboltz, J. D., & Christal, R. E. Relative pilot aptitude and success in primary pilot training. J. appl. Psychol., 1957, 41, 409-413.


Journal of Applied Psychology
Vol. 43, No. 2, 1959


PERSONALITY CORRELATES OF SOCIOMETRIC
STATUS¹ ²


CARROLL E. IZARD 


Vanderbilt University 


Current sociometric ranking and rating techniques were derived from sociometry, a method advanced by Moreno (1934) for analyzing the feeling or preference relationships among the members of a human group. The original sociometric device as modified by various investigators has been used in measuring the effects of psychotherapy (Kelman & Parloff, 1957), social adjustment (Izard, Rosenberg, Bair, & Maag, 1953), and leadership potential (Hollander, 1953; Izard & Rosenberg, 1958; McClure, Tupes, & Dailey, 1951; Stogdill, Scott, Elton, Jaynes, Miller, Fleishman, Wherry, & Bakan, 1953). Sociometric measures have been found reliable (Anderhalter, Wilkins, & Rigby, 1952; McClure et al., 1951; Wherry & Fryer, 1949) and significantly related to such criteria as academic grades (Williams & Leavitt, 1947), ratings of superiors (Hollander, 1953), graduation-elimination (Hollander, 1953; McClure et al., 1952), and on-the-job ratings (McClure et al., 1952; Wherry & Fryer, 1949). The present paper reports three studies of the personality correlates of sociometric status.

A measure of sociometric status in terms of leadership was chosen for two reasons: it has been shown to be highly reliable (Webb, 1954); and leadership seemed closely related conceptually to the usual notion of status, especially since the groups being studied were military.

¹ Begun while with Tulane University, ONR Project NR 154.008, at the U. S. Naval School of Aviation Medicine, Pensacola, Florida. Opinions or conclusions contained in this report are those of the author. They are not to be construed as necessarily reflecting the views or possessing the endorsement of the Navy Department.

² John H. Manhold, now with Washington University School of Dentistry, collaborated with the author on the first of the three studies presented in this paper. A full report of their joint effort was printed as U. S. Naval School of Aviation Medicine Rep. [illegible] NM 001 077.01.08 and read at the Southern Society for Philosophy and Psychology, Atlanta, Georgia, [date illegible].


Over-All Adjustment—General Medical and 
Psychogenic Factors 


In the first study it was hypothesized that: 
(a) a group of cadets with a high number of 
dispensary visits and hospitalizations would 
have a lower mean sociometric leadership 
score than the cadet population; and (b) that 
within this group, cadets judged to be in 
a psychogenic or psychosomatic classification 
would have a lower mean sociometric leader- 
ship score than the remainder of the group. 


Procedure 


Subjects. The sample selected for this study consisted of the 26 classes (N = 1080) who entered the Naval Air Training Program during the first half of 1953. The Ss ranged from 18 to 27 years of age. They had at least two years of college education or its equivalent. They were selected for the program on the basis of an individual interview conducted by a flight surgeon, a battery of psychometric measures, and the usual physical examination for naval aviation candidates.


The health data. The investigator obtained a complete record of all dispensary and hospital visits made by the 1080 cadets during the pre-flight course and the first three stages of flight training. These records made it possible to collate for each individual an eight-month cumulative medical history which showed the date reported to dispensary, complaint, diagnosis, treatment, and disposition of case.
The sociometric measure of leadership. The socio- 
metric measure was a peer nomination form which 
carried a definition of leadership and instructions to 
nominate in order the three best and three least 
qualified Ss for leadership positions in the program 
in which the group was participating. The socio- 
metric measure was administered to cadet groups, 
each consisting of about 20 men who had been living 
and working together for 13 weeks. The resulting 
ordinal data were normalized by means of Fisher's 
rankit transformation (Fisher & Yates, 1953). 
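As a rough illustration of what a rankit transformation does, the sketch below converts ranks 1 to n into approximate expected normal order statistics. It is a sketch under stated assumptions: the exact rankits are tabulated in Fisher & Yates, whereas this code swaps in Blom's closed-form approximation, and the function name `approximate_rankits` is hypothetical. The source does not describe how the peer nominations were turned into ranks, so that step is not shown.

```python
from statistics import NormalDist


def approximate_rankits(n):
    """Approximate rankits (expected standard-normal order statistics)
    for ranks 1..n using Blom's formula; exact tabled values differ slightly."""
    standard_normal = NormalDist()  # mean 0, standard deviation 1
    return [
        standard_normal.inv_cdf((rank - 0.375) / (n + 0.25))
        for rank in range(1, n + 1)
    ]


# Example: a cadet group of about 20 men, as described in the text.
RANKITS_20 = approximate_rankits(20)

if __name__ == "__main__":
    for rank, score in enumerate(RANKITS_20, start=1):
        print(f"rank {rank:2d}: {score:+.3f}")
```

Because the scores are symmetric about zero, a group's mean rankit is zero, which is why the population mean is taken as zero in the comparisons that follow.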


Results 

Of the 1080 cadets in the total sample, 167 had made five or more visits to the dispensary or hospital during the eight-month period. Sociometric data were available on 127 of these Ss. Their mean sociometric leadership score in terms of rankits was -.259; the standard deviation was .985. The t test for the difference between the observed mean and the population mean yielded a value of 2.98, P < .01.

Two judges working independently and with- 
out knowledge of sociometric scores classified 
the 167 Ss who made five or more dispensary 
visits into a “psychosomatic” and a “nonpsy- 
chosomatic” group. The judges agreed on 
139 of the 167 Ss classified. Interjudge reli- 
ability as measured by Kendall's (1948) tau, 
was .66; the Pearson product-moment coeffi- 
cient estimated from this value was .86. Sub- 
sequent statistical analyses were concerned 
only with those Ss on whom both judges 
agreed and for whom sociometric data were 
available. The final psychosomatic group had 
56 Ss and the nonpsychosomatic had 47. 

The psychosomatic group had a mean socio- 
metric score of —.551 and a standard devia- 
tion of .940. For the nonpsychosomatic group 
the mean was .199, the standard deviation 
.914. The Fmax test (Bliss & Calhoun, 1953)
showed that the variances were homogeneous. 
The analysis of variance of the sociometric 
scores presented in Table 1 indicated that the 
two groups had significantly different means. 

The mean sociometric values of the psycho- 
somatic and nonpsychosomatic groups were 
also compared with the population mean of 
zero. For the psychosomatic group the t was 4.37, P < .001; for the nonpsychosomatic group the t was 1.50, P > .10.
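The t values above can be approximated from the reported means, standard deviations, and group sizes. The sketch assumes a one-sample t statistic against a population mean of zero with the standard error taken as SD divided by the square root of N; the paper does not state the exact formula, so the results come out close to, but not identical with, the printed values. The function name `one_sample_t` is hypothetical.

```python
from math import sqrt


def one_sample_t(mean, sd, n, population_mean=0.0):
    """t statistic for a sample mean against a fixed population mean,
    using sd / sqrt(n) as the standard error (an assumption)."""
    return (mean - population_mean) / (sd / sqrt(n))


# Means, standard deviations, and Ns as reported in the text.
T_HIGH_VISIT = one_sample_t(-0.259, 0.985, 127)
T_PSYCHOSOMATIC = one_sample_t(-0.551, 0.940, 56)
T_NONPSYCHOSOMATIC = one_sample_t(0.199, 0.914, 47)

if __name__ == "__main__":
    print(f"high-visit group:       t = {T_HIGH_VISIT:.2f}")
    print(f"psychosomatic group:    t = {T_PSYCHOSOMATIC:.2f}")
    print(f"nonpsychosomatic group: t = {T_NONPSYCHOSOMATIC:.2f}")
```

The sign of each statistic shows the direction of the departure from zero; the text reports absolute values.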

With respect to number of medical com- 
plaints, the mean and standard deviation for 
the psychosomatic group was 10.84 and 7.09 
respectively. Comparable statistics for the 
nonpsychosomatic group were 6.53 and 1.67. 
The fact that the difference in these means 
was significant at the .01 level suggested that 


Table 1

Analysis of Variance of the Sociometric Scores of the
Psychosomatic and Nonpsychosomatic Groups

Source            df    Variance      F       P
Between Groups     1    14.3441     16.65    .001
Within Groups    101      .8616
Total            102


the psychosomatic group might be lower on 
leadership chiefly because it had a greater 
mean frequency of medical complaints. To 
test this latter possibility, the psychosomatic 
group was subdivided into low and high halves 
with respect to frequency of medical com-
plaints. The mean number of medical com-
plaints for the low half of the psychosomatic
group was 6.56. This was practically identi-
cal with the mean of 6.53 for the nonpsycho-
somatic group. However, for the low-fre-
quency psychosomatic group the mean socio-
metric score of -.55 was significantly lower
(P < .01) than that for the nonpsychosomatic
group and identical with that for the high-
frequency and over-all psychosomatic groups.


Performance in Group Activities 


In the second study it was hypothesized
that sociometric status was related to per-
formance or proficiency in the activities in
which the group is engaging. The perform-
ance index was suggested by the findings of
Rosenberg (1954). He showed there was a
significant relationship between time taken to
complete the Naval Air Training Program
and all important measures of cadet perform-
ance during training.


Procedure 


The Ss and the performance index. The groups
designated in Rosenberg’s study of fiscal 1951 gradu-
ates as fast (15 months to graduate), middle ([illegible]
months to graduate), and slow (20 months to gradu-
ate) were utilized here as the high, middle, and low
groups on training program performance. The Ns
for these groups were 47, 100, and 50 respectively.
Rosenberg (1954) showed that the performance in-
dex, time taken to complete training, did not dif-
ferentiate the high, middle and low groups on the
selection test measures of scholastic and mechanical
aptitude, but it effectively ranked these groups on
the following measures of cadet performance: [pre]-
flight grades, flight grades, number of flight [hours] to
complete training, number of unsatisfactory [flights],
number of accidents, and number of board (disci-
plinary) actions.

The sociometric data. Sociometric data as previ-
ously described were available on all 47 cadets in
the high performance group, on 48 in the low group,
and on 99 in the middle group.


Results 
The mean sociometric scores for the high,
middle, and low performance groups were
[illegible], .105, and -.230, respectively; the standard


Personality Correlates of Sociometric Status 91 


Table 2 


Analysis of Variance of the Sociometric Scores of the 
High, Middle, and Low Performance Groups 


Source            df    Variance      F       P
Between Groups     2     2.793      3.90    .025
Within Groups    191      .7[illegible]
Total            193      .738


deviations were .881, .841, and .823. Table
2 presents the analysis of variance of the
sociometric scores for the three groups.

The analysis of variance indicated that the
mean sociometric scores of the three groups
were significantly different. Comparison of
the groups by the t test showed that both the
high and middle performance groups had sig-
nificantly higher sociometric leadership scores
than did the low group. The mean for the
high group was greater than that of the mid-
dle group, although this difference was not
statistically significant. The trend over all
groups was in the expected direction: the
higher the performance index, the higher the
mean sociometric leadership score. These
results support the hypothesis that over-all
performance or proficiency in the activities
in which the group is engaging is a significant
correlate of sociometric status.
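
The within-groups term of this analysis can be recovered from the reported standard deviations, which gives a check on the F ratio. The sketch below assumes group sizes of 47, 99, and 48 (the Ss with sociometric data), standard deviations computed with the n - 1 divisor, and the between-groups variance of 2.793 from Table 2; the function name is hypothetical.

```python
def pooled_within_variance(groups):
    """Pooled within-groups mean square from (n, sd) pairs (sd with n - 1 divisor)."""
    ss_within = sum((n - 1) * sd ** 2 for n, sd in groups)
    df_within = sum(n for n, _ in groups) - len(groups)
    return ss_within / df_within, df_within

# (n, sd) for the high, middle, and low performance groups.
groups = [(47, 0.881), (99, 0.841), (48, 0.823)]
ms_within, df_within = pooled_within_variance(groups)

ms_between = 2.793          # between-groups variance reported in Table 2
f_ratio = ms_between / ms_within

print(f"within-groups variance: {ms_within:.4f} on {df_within} df")
print(f"F = {f_ratio:.2f}")
```

On these assumptions the recomputed ratio agrees with the reported F of 3.90 to within rounding.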

The evidence for equal aptitudes among
groups suggests that the performance index
may reflect an affective or motivational factor.
Interpreting the present results as evidence
for a relationship between sociometric status
on leadership and an affective factor such as
need for achievement is in keeping with the
findings of Henry (1949), Hanawalt, Hamil-
ton and Morris (1943), and Warner and
Abegglen (1955).

Aptitudes, Superiors’ Ratings, and a Forced-
Choice Self-Description Inventory

In this study it was hypothesized that rele-
vant aptitudes, superiors’ ratings, and a forced-
choice leadership inventory would correlate
significantly with sociometric status.


Procedure


Subjects. The Ss selected for this study were the
330 cadets in the fortieth through the forty-sixth
classes that entered the Naval Air Training Program
in 1953.

Tests and ratings. The aptitude tests were the 
Aviation Classification Test (ACT), a measure of 
scholastic aptitude or general intelligence; the Me- 
chanical Comprehension Test (MCT), a measure of 
mechanical aptitude; and the Physical Fitness Tests
(PFT), a measure of physical aptitude for activities
involving the large muscles. These three tests were 
administered during the first week of training. 

The superiors’ ratings utilized in this study were 
ratings of Officer-Like-Qualities (OLQ) made after 
13 weeks of training and entered in official Navy 
records. They represent an over-all rating on per- 
sonal characteristics relevant to success as a naval 
officer. 

The forced-choice personality measure utilized in 
this study was the ROTC Self Description Inventory 
developed by Brogden and his associates (Brogden, 
Machlin, Loeffler, Newkirk, & Yaukey, 1952). For 
purposes of this study navy terminology was substi- 
tuted for army terminology and the inventory was 
designated as the Navy Self Description Inventory 
(NSDI). The original inventory was empirically 
derived and validated against ROTC and West Point 
aptitude-for-service ratings. To date there has been 
no attempt to determine the specific factors meas- 
ured by the inventory and little can be said along 
this line except that the factor or factors measured 
are nonintellective in nature. The NSDI was admin- 
istered to the cadets during the thirteenth week of 
training. The sociometric measure of leadership was 
the same as that utilized in the two preceding studies. 


Results 


The relationship of each of the five hy-
pothesized correlates to the sociometric meas-
ure of leadership and to each other is shown
in Table 3. All of the intercorrelations among
the five correlates are low enough that they
can be considered relatively independent.
The correlation of each of the five with the 


Table 3

Intercorrelations of the Self-Description Inventory
(NSDI), Physical Aptitude (PA), Scholastic Aptitude
(ACT), Mechanical Aptitude (MCT), Superiors’ Rat-
ings (OLQ), and the Sociometric Measure of Leader-
ship (SML)

(N = 330)

         PA     ACT    MCT    OLQ    SML
NSDI     [row illegible]
PA              [row illegible]
ACT                    .38    .21    .19
MCT                           .19    [illegible]
OLQ                                  .66




sociometric measure of leadership is signifi-
cant at the .01 level except MCT where
P = .05.
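
For a sample of 330, the size of correlation needed to reach these significance levels can be approximated with Fisher's z transformation. The sketch below is an illustration only: the text does not say which test was applied, the normal approximation is an assumption, and the function names are hypothetical.

```python
import math
from statistics import NormalDist

def fisher_z_p(r: float, n: int) -> float:
    """Two-tailed P for H0: rho = 0, using Fisher's z with a normal approximation."""
    z = math.atanh(r) * math.sqrt(n - 3)
    return 2.0 * (1.0 - NormalDist().cdf(abs(z)))

def critical_r(alpha: float, n: int) -> float:
    """Smallest |r| that reaches two-tailed significance at level alpha."""
    z_crit = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    return math.tanh(z_crit / math.sqrt(n - 3))

N = 330
r_05 = critical_r(0.05, N)
r_01 = critical_r(0.01, N)
p_olq = fisher_z_p(0.66, N)   # superiors' ratings with sociometric status

print(f"|r| needed at the .05 level: {r_05:.3f}")
print(f"|r| needed at the .01 level: {r_01:.3f}")
print(f"P for r = .66: {p_olq:.3g}")
```

On this approximation, a coefficient significant at the .05 but not the .01 level with N = 330 would lie roughly between .11 and .14.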

Four of the five correlates in Table 3 are 
tests which can be administered to candidates 
prior to entering the pilot training program. 
A multiple correlation coefficient was com- 
puted to determine the effectiveness of these 
measures for predicting sociometric status and 
thus for selecting Ss with leadership potential. 
The multiple correlation was .40; after cor- 
recting for shrinkage, cR was .39. 

A second multiple correlation was computed 
to determine the amount of variance in the 
sociometric measure that could be accounted 
for utilizing all five correlates. This R was 
only .67 (cR = .66), not significantly differ- 
ent from the product moment correlation of 
.66 between sociometric status and superiors’ 
ratings. 
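
The shrinkage corrections quoted above can be approximated with the familiar adjustment of the squared multiple correlation. The sketch assumes the form 1 - (1 - R²)(N - 1)/(N - k - 1) with N = 330; the exact formula used in the study is not stated, and the function name is hypothetical.

```python
import math

def shrunken_r(r: float, n: int, k: int) -> float:
    """Multiple R corrected for shrinkage with n cases and k predictors."""
    adjusted_r2 = 1.0 - (1.0 - r * r) * (n - 1) / (n - k - 1)
    return math.sqrt(max(adjusted_r2, 0.0))

N = 330
cr_four = shrunken_r(0.40, N, 4)   # the four tests usable in selection
cr_five = shrunken_r(0.67, N, 5)   # all five correlates

print(f"four predictors: R = .40, cR = {cr_four:.2f}, variance accounted for = {0.40 ** 2:.2f}")
print(f"five predictors: R = .67, cR = {cr_five:.2f}, variance accounted for = {0.67 ** 2:.2f}")
```

With a sample this large relative to the number of predictors, the correction changes either coefficient by about .01.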

Further examination of the correlation ma- 
trix in Table 3 affords some noteworthy obser- 
vations about the personality measures under 
consideration. By far the highest product 
moment correlation was between officers’ rat- 
ings of the cadets (OLQ) and sociometric 
status or the cadets’ ratings of each other 
(SML). This product moment correlation 
(.66) was considerably higher than the mul- 
tiple correlation (.40) of all the other vari- 
ables (aptitude tests and self-description 
inventory) with sociometric standing. These 
test and inventory indices correlated almost 
identically with superiors’ ratings and socio- 
metric ratings. The first order correlation of 
.66 was essentially equal to the multiple cor- 
relation of .67 between all the correlates and 
sociometric status. It follows that all the 
test and inventory measures together
account for only part of the variance common to supe-
riors’ ratings and the sociometric measure.
This is a situation where “subjective” (socio- 
metric) ratings based on direct observations 
of behavior are quite superior to “objective” 
(test, inventory) measures in predicting a 
criterion. 

Summary 


The three studies presented in this paper 
were designed to ascertain some personality 
correlates of sociometric status. In the first 
two studies, sociometric status was validated 
against rather holistic behavioral indices— 


one based on psychogenic factors in health, 
the other on performance in the activities in 
which the group was engaging. Groups of Ss
categorized in terms of these indices differed
significantly on the sociometric measure.
These findings were interpreted as evidence
supporting the frequently made but infre-
quently tested assumption that sociometric
measures reflect meaningful personality vari-
ables which can be reliably measured in terms
of observable behavior.

The third study examined the relationship
of three psychometric indices of personality,
physical aptitude, and superiors’ ratings to
sociometric status. All five of these measures
correlated positively and significantly with
sociometric status. The four tests which are
presently usable in selection yielded a mul-
tiple correlation of .40 with sociometric status
measured in terms of leadership. This points
toward the feasibility of developing a
battery for the selection of individuals with
leadership potential.

In studying the relative effectiveness of the
various correlates in accounting for the vari-
ance in sociometric status, superiors’ ratings
were better than the four test and inventory
indices combined. However, superiors’ rat-
ings and sociometric status have to be con-
sidered as essentially concomitant criteria of
personality variables (in the present case,
intermediate criteria of leadership), while the
test and inventory measures may be consid-
ered as predictors of these criteria.


Received April 28, 1958. 


References 


Anderhalter, D. F., Wilkins, W. L., & Rigby, [initials illegible].
Peer ratings. St. Louis Univer., Office of Naval
Research Contract N7onr-4082 (NR 151-[illegible]),
1952, Tech. Rep. No. 2.

Bliss, C. I., & Calhoun, D. W. An outline of
biometry. New Haven: Yale Co-operative Corp.,
1953.

Brogden, H. E., Machlin, Claire T., Loeffler, June
C., Newkirk, G. F., & Yaukey, D. W. Construction
and validation of the ROTC Self-Description
Blank, Forms I and II (DA AGO PRT 1743-[illegible]).
Personnel Research Section, Adjutant General's
Office, Dept. of the Army, 1952, PRS Rep. [number illegible].

Fisher, R. A., & Yates, F. Statistical tables for bio-
logical, agricultural, and medical research. New
York: Hafner, 1953.

Hanawalt, N. G., Hamilton, C. E., & Morris, M. L.
Level of aspiration in college leaders and non-
leaders. J. abnorm. soc. Psychol., 1943, 38, 545-
548.

Henry, W. E. The business executive: The psycho-
dynamics of a social role. Amer. J. Sociol., 1949,
54, 286-291.

Hollander, E. P. A further consideration of peer
nominations on leadership in the Naval Air Train-
ing Program: Prediction of completion or failure.
USN Sch. Aviat. Med. Rep., 1953, No. NM 001
058.16.02.

Izard, C. E., & Rosenberg, N. Effectiveness of a
forced-choice leadership test under varied experi-
mental conditions. Educ. psychol. Measmt, 1958,
18, 57-62.

Izard, C. E., Rosenberg, N., Bair, J. T., & Maag,
C. H. Construction and validation of a multiple-
choice completion test: An interim report. USN
Sch. Aviat. Med. Rep., 1953, Res. Proj. NM 001
077.01.02.

Kelman, H. C., & Parloff, M. B. Interrelations among
three criteria of improvement in group therapy:
Comfort, effectiveness, and self-awareness. J. ab-
norm. soc. Psychol., 1957, 54, 281-288.

Kendall, M. G. Rank correlation methods. London:
Charles Griffin, 1948.

McClure, G. E., Tupes, E. C., & Dailey, J. T. Re- 
search on criteria of officer effectiveness. Hum. 
Resour. Res. Cent., Res. Bull., 1951, No. 51-8. 

Moreno, J. L. Who shall survive? Washington, 
D. C.: Nerv. and Mental Dis. Pub. Co., 1934. 

Rosenberg, N. Time to complete Naval air training 
as an additional criterion of success. USN Sch. 
Aviat. Med. Proj. Rep., 1954, Proj. No. NM 001 
077.01.04. 

Stogdill, R. M., Scott, E. L., Elton, C. F., Jaynes, 
W. E., Miller, P., Fleishman, E. A., Wherry, R. J., 
& Bakan, D. Aspects of leadership and organiza- 
tion. Ohio State Univer., Office of Naval Research 
Contract N6Ori-17, T. O. III NR 171 123, 1953. 

Warner, W. L. & Abegglen, J. C. Big business lead- 
ers in America. New York: Harper, 1955. 

Webb, W. B. Reliability of peer ratings. In Studies 
in selection and training. USN Sch. Aviat. Med. 
Rep, 1954, No. NM 001 058.25.13. 

Wherry, R. J., & Fryer, D. H. Buddy ratings: Popu- 
larity contests or leadership criteria? Personnel 
Psychol., 1949, 2, 147-159. 

Williams, S. B., & Leavitt, H. J. Group opinion as 
a predictor of military leadership. J. consult. 
Psychol., 1947, 11, 283-291. 


Journal of Applied Psychology 
Vol. 43, No. 2, 1959 


THE PROBLEM OF PRESELECTION IN WEIGHTED
APPLICATION BLANK STUDIES


JAMES H. MYERS AND WADE ERRETT


Prudential Insurance Company, Los Angeles 


The weighted application blank has come 
into wide use as a selection tool in recent 
years. Generally, the development of such a 
tool proceeds by establishing criterion groups 
(good vs. poor employees, terminated vs. 
present employees, etc.) and comparing these 
groups on a number of biographical data or 
personal history items (age, marital status, 
education, etc.). Weights are assigned to 
each item in accordance with its ability to 
discriminate between the criterion groups. 
Weights so developed are applied to new ap- 
plicants to predict later success. 
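
The weighting step can be sketched as follows. The article does not specify a weighting scheme, so the one shown here (unit weights based on the difference in response percentages between criterion groups, with a 10-point threshold) is only an assumed illustration; the item names, counts, threshold, and function names are all hypothetical.

```python
def item_weights(good_counts, poor_counts, threshold=10.0):
    """Assign +1, 0, or -1 to each response category of one item.

    A category gets +1 when it is at least `threshold` percentage points
    more common in the good group, -1 when it is that much more common in
    the poor group, and 0 otherwise.
    """
    n_good = sum(good_counts.values())
    n_poor = sum(poor_counts.values())
    weights = {}
    for category in good_counts:
        pct_good = 100.0 * good_counts[category] / n_good
        pct_poor = 100.0 * poor_counts.get(category, 0) / n_poor
        diff = pct_good - pct_poor
        weights[category] = 0 if abs(diff) < threshold else (1 if diff > 0 else -1)
    return weights

def score(applicant, key):
    """Total weighted score of an applicant's answers under a scoring key."""
    return sum(key[item].get(answer, 0) for item, answer in applicant.items() if item in key)

# Hypothetical criterion-group counts for two items.
key = {
    "marital_status": item_weights({"married": 60, "single": 40},
                                   {"married": 35, "single": 65}),
    "education": item_weights({"high school": 50, "college": 50},
                              {"high school": 55, "college": 45}),
}
applicant = {"marital_status": "married", "education": "college"}

print(key)
print("applicant score:", score(applicant, key))
```

In this made-up example only the marital status item separates the groups by more than the threshold, so it alone contributes to the applicant's score.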

In weighted application blanks, as in men-
tal testing, the problem of preselection arises.
For validation purposes, investigators are
limited to persons actually hired. However,
a careful search of the literature fails to re-
veal a single weighted application blank study
where any attention has been given to the
amount of preselection which has occurred,
not to mention the effect this should have
upon the design of the final instrument for
screening applicants.


Preselection 


Table 1 illustrates the amount and degree
of preselection which can occur in a practical
situation. These data are based upon [number illegible]
applications for clerical jobs in Prudential’s
Western Home Office in Los Angeles. It is
interesting to note that of the 19 biographi-
cal-type items, 10 were already being used as
a basis for selection, with confidence limits at
or beyond the .001 level!

Table 1 also shows that, among persons
actually hired, only five of the 19 items (Nos.
2, 5, 6, 7, and 16) were found to discriminate

Table 1

Discrimination Levels of Bio-Data Items

                                                             Hired vs.   Terminated vs.
                                                             Nonhired    Nonterminated
Item                                                            P             P
 1. Occupation                                                 .30           .20
 2. Courses liked best in High School                          .70           .02
 3. Attendance at Business or Technical School                <.001          .30
 4. Would like to go to College now                           <.001          .20
 5. Plan to go to College in 5 years                           .10           .001
 6. Number of acquaintances working at Prudential             <.001         <.001
 7. Number of close friends working at Prudential              .01           .02
 8. Amount of money needed for living expenses                <.001          .30
 9. Number of friends attending college                        .50           .50
10. Years at present address                                   .001          .10
11. Years in Los Angeles                                      <.001          .30
12. Years in California                                       <.001          .50
13. If hired, when available for work                          .50           .50
14. Expected starting salary                                  <.001          .80
15. Expected salary after one year                            <.001          .70
16. Difference between expected start and year-end salary      .20           .02
17. Reason for selecting Prudential                           <.001          .20
18. Weight                                                     .95           .20
19. Height                                                     .50           .20




between terminated and nonterminated em- 
ployees at or beyond the .05 level of confi- 
dence. The usual procedure would call for 
applying the weights from these five items to 
all incoming applicants as an integral part of 
the selection process. However, such a pro- 
cedure assumes no predictive value for the 10 
items on which significant preselection has 
occurred. To the extent that these items do
have validity, the final selection instrument 
described above will be less than maximally 
effective. As a matter of fact, if the five items 
were to replace the existing selection pro- 
cedures, they could actually worsen the se- 
lection process by not allowing the preselec- 
tion items to operate. 

An example will help clarify this. It can 
be seen from Table 1 that neither “years in 
Los Angeles” nor “years at present address” 
distinguished turnover proneness in the group 
actually hired. One would “logically” expect 
less turnover among settled members of the 
community. However, less settled applicants 
were very probably not hired, judging from 
the significant amount of preselection indi-
cated for these items. Yet, the five-item scor-
ing key would make no provision for screen-
ing these applicants out.

At least one of the 10 preselection items
(number of acquaintances working at Pru-
dential) was found to be predictive of turn-
over at the .001 level, even among the greatly
restricted range found in the group actually
hired. The failure of the other nine items to
predict among the employed group may well
be due in some measure to the significant pre-
selection. This could also account for the
fact that “logical” items often fail to predict 
in other weighted application blank studies. 


Discussion 


It is, of course, impossible to determine the
predictive value of preselection items without
hiring all applicants for a period of time and
following their later progress. Clearly, how-
ever, something should be done to make the
final weighting system as effective as possible.
Three possibilities suggest themselves:

1. Apply the weights developed in the
usual manner only to those applicants who
have survived all steps of the normal screen-
ing processes; i.e., allow preselection to oper-
ate prior to utilizing the weights. This would
be the simplest procedure.

2. Apply “restriction in range” corrections 
to individual item validity coefficients, where
assumptions can be met. This, of course, can 
be done only where the items are continuous 
(age, height, weight) and not discrete (mari- 
tal status, occupation). 

3. Develop “preselection weights,” based 
upon differences between those hired and 
those rejected. These weights could then be 


used in one of two ways: 


a. In a two-stage screening process. Pre- 
selection weights would be applied first 
to incoming applicants, to predict whether 
or not they would have passed the nor- 
mal screening process. Then weights 
based on those actually hired could be 
applied only to those passing the initial 
screening, to further refine the predic- 
tion of later progress. 

b. Both sets of weights (preselection and 
usual weighted application blank weights) 
for any given item could be combined 
to produce a single weight. This would 
be generally satisfactory where both 
weights were of the same sign (+ or —), 
but would present problems in the case 
of opposite signs. 
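
The “restriction in range” correction of possibility 2 can be illustrated with the standard formula for direct selection on the predictor (often called Thorndike's Case 2). This is a sketch under the usual assumptions of linear regression and equal error variance across the range; the article does not name a particular formula, and the function name and numerical values are hypothetical.

```python
import math

def correct_for_range_restriction(r: float, sd_applicants: float, sd_hired: float) -> float:
    """Estimate the validity in the full applicant group from the validity
    observed among those hired, given the item's SD in each group."""
    k = sd_applicants / sd_hired
    return r * k / math.sqrt(1.0 - r * r + r * r * k * k)

# Hypothetical values: an item correlates .20 with the criterion among
# those hired, and its SD among all applicants is twice that among the hired.
r_hired = 0.20
r_corrected = correct_for_range_restriction(r_hired, sd_applicants=10.0, sd_hired=5.0)

print(f"observed r among hired:     {r_hired:.2f}")
print(f"estimated r in applicants:  {r_corrected:.2f}")
```

When the two standard deviations are equal the formula returns the observed coefficient unchanged, which is a quick check that no correction is applied where no restriction has occurred.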


The type and degree of preselection being
used is known only at the time of a study. It
is primarily a function of the emphases of the 
employment interviewers (within the frame- 
work, of course, of fair employment prac- 
tices). Since these emphases can change at 
any time and alter, or even reverse, the pre- 
selection items in use, No. 3 would appear to 
be the best way to handle the prescreening 
items. 
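
The two uses of preselection weights described under possibility 3 can be sketched as follows. The scoring keys, item names, answer categories, weight directions, and cutoffs are all hypothetical, as are the function names; the sketch only shows the order of operations in the two-stage process and how opposite-signed weights surface when the two keys are combined.

```python
def blank_score(applicant, key):
    """Sum the weights of an applicant's answers under one scoring key."""
    return sum(weights.get(applicant.get(item), 0) for item, weights in key.items())

def two_stage_screen(applicant, preselection_key, hired_key, pre_cutoff, final_cutoff):
    """Possibility 3a: preselection weights first, then weights based on those hired."""
    if blank_score(applicant, preselection_key) < pre_cutoff:
        return "screened out at stage 1"
    if blank_score(applicant, hired_key) < final_cutoff:
        return "screened out at stage 2"
    return "passed"

def combine_keys(preselection_key, hired_key):
    """Possibility 3b: add the two weights and list answers whose weights differ in sign."""
    combined, conflicts = {}, []
    for item in sorted(set(preselection_key) | set(hired_key)):
        pre, post = preselection_key.get(item, {}), hired_key.get(item, {})
        combined[item] = {}
        for answer in sorted(set(pre) | set(post)):
            w_pre, w_post = pre.get(answer, 0), post.get(answer, 0)
            if w_pre * w_post < 0:
                conflicts.append((item, answer))
            combined[item][answer] = w_pre + w_post
    return combined, conflicts

# Hypothetical keys and applicants.
preselection_key = {"years_in_city": {"5 or more": 1, "under 1": -1},
                    "acquaintances": {"3 or more": 1, "none": -1}}
hired_key = {"acquaintances": {"3 or more": 1, "none": -1},
             "college_plans": {"yes": -1, "no": 1}}
settled = {"years_in_city": "5 or more", "acquaintances": "3 or more", "college_plans": "no"}
newcomer = {"years_in_city": "under 1", "acquaintances": "none", "college_plans": "no"}

print(two_stage_screen(settled, preselection_key, hired_key, 1, 1))
print(two_stage_screen(newcomer, preselection_key, hired_key, 1, 1))
combined, conflicts = combine_keys(preselection_key, hired_key)
print(combined, conflicts)
```

Keeping the two keys separate, as in the two-stage version, means the preselection key can be revised when interviewer emphases change without disturbing the weights derived from those actually hired.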
It should be noted that the approaches to 
handling this problem suggested above assume 
validity for the items on which significant 
preselection has occurred. With lack of evi- 
dence to the contrary, and with some to the 
affirmative (e.g., “number of acquaintances” 
item), it seems to us that this course of ac- 
tion is safer than assuming no validity for 
preselection items. 


Received May 13, 1958. 


Journal of Applied Psychology 
Vol. 43, No. 2, 1959 


EFFECTS OF NOISE ON HUMAN PERFORMANCE ¹


HARRY J. JERISON 


Antioch College 


Until about 1948, the only proper answer 
to a question on possible effects of noise on 
nonauditory performance would have been 
that none had been demonstrated. Kryter 
(1950), who reviewed the experimental evi- 
dence available then, concluded that nearly 
all, if not all, studies showing deleterious ef- 
fects of noise could be criticized severely on 
the basis of faulty procedures. Since that
time, Broadbent (1953, 1954) has demon- 
strated changes in working efficiency on tasks 
involving vigilance (alertness) and on a self- 
paced or externally paced serial reaction task 
provided the tasks were performed without 
interruption for relatively long time periods. 
The experiments to be described confirm 
Broadbent’s results on vigilance and indicate 
additional measurable performance changes 
in relatively high energy noise fields. 


General Procedure 


In the three experiments to be reported here 
the general procedure was to run Ss individu- 
ally through three work sessions with one- 
week intervals between sessions. Subjects 
were paid volunteer male undergraduates. 
After all of the Ss for a particular experi- 
ment were chosen they were assigned ran- 
domly to two subgroups. The subgroups were 
constituted to counterbalance order effects, 
and the order of undergoing various proced-
ures is indicated in Table 1. The training ses-
sion, Session I, was one hour long for Experi-
ment I on vigilance and two hours long for
Experiments II and III.

¹ This article is based on a paper presented before
the Aero Medical Association in April 1956. It re-
ports the results of experiments performed in 1954
and 1955 while the author was at the Psychology
Branch, Aero Medical Laboratory. The preparation
of this report was supported by the United States
Air Force under Contract No. AF 33(616)-6095,
monitored by the Aero Medical Laboratory, Direc-
torate of Laboratories, Wright-Patterson Air Force
Base, Ohio.
The [illegible] and criticisms of Virginia L. Senders
and W. Dean Chiles on various phases of the experi-
ments reported here are gratefully acknowledged.
The author is also indebted to Arden K. Smith, Ben-
jamin Chi, and Shelley Wing who served as research
assistants.


The designation “quiet” in Table 1 refers
to a noise that was used to mask the sounds
of equipment. In Experiment I this was
about 83 db re .0002 dyne/cm², and in Ex-
periments II and III it was about 77.5 db.
The designation “noise” refers to the high
level noise which was our major concern. In
Experiment I it was about 114 db, and in Ex-
periments II and III it was about 111.5 db.
A spectral analysis of the noise is presented
in Fig. 1. The noise was generated elec-
tronically and broadcast by a loudspeaker
mounted in the S’s room.
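
To give a sense of scale for these levels, the sketch below converts the over-all sound pressure levels to pressures using the stated reference of .0002 dyne/cm². It assumes the usual definition of sound pressure level, 20 log10(p / p_ref); the function names are hypothetical.

```python
P_REF = 0.0002  # reference pressure in dyne/cm^2, as stated in the text

def spl_to_pressure(db: float) -> float:
    """Sound pressure (dyne/cm^2) corresponding to a level in db re P_REF."""
    return P_REF * 10 ** (db / 20)

def pressure_ratio(db_a: float, db_b: float) -> float:
    """Ratio of the sound pressures at two levels."""
    return 10 ** ((db_a - db_b) / 20)

levels = {
    "Experiment I quiet": 83.0,
    "Experiment I noise": 114.0,
    "Experiments II and III quiet": 77.5,
    "Experiments II and III noise": 111.5,
}
for label, db in levels.items():
    print(f"{label}: {db} db = {spl_to_pressure(db):.3f} dyne/cm^2")

ratio_exp1 = pressure_ratio(114.0, 83.0)
ratio_exp23 = pressure_ratio(111.5, 77.5)
print(f"noise/quiet pressure ratio, Experiment I: {ratio_exp1:.1f}")
print(f"noise/quiet pressure ratio, Experiments II and III: {ratio_exp23:.1f}")
```

On this definition the 31 db and 34 db differences between “noise” and “quiet” correspond to sound pressures roughly 35 and 50 times greater.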


Method and Results 
Experiment I: Noise and Vigilance 


The purpose of this experiment was to check
Broadbent’s previously reported results that per-
formance on a prolonged vigilance task was poorer
in noise than in quiet. The S’s task was to monitor
a panel of three Mackworth-type clocks (cf. Mack-
worth, 1950) and to press a response switch under
a clock when its hand stepped through twice its
usual excursion. The apparatus is illustrated in
Fig. 2. Double steps occurred haphazardly at inter-
vals that averaged about once a minute for each
clock.

The results of this experiment are summa-
rized in Fig. 3, which gives the average per-
centage correct for the nine Ss of this experi-
ment during their experimental and control
sessions. It should be noted that average per-
formance during these two sessions when noise
levels were the same, that is, during the first
half hour, was about 10 per cent better dur-
ing the control session. The difference be-
tween the sessions during the second and third
half hours, when the 114 db noise was present
for the experimental session, should, there-
fore, not be attributed to an effect of noise.
The parallel orientation of the two curves
during the first one and one-half hours indi-
cates that noise had essentially no effect on


Table 1

General Experimental Design

               Session I             Session II                   Session III
Subgroup QN    Training              Control                      Experimental
               (Quiet throughout)    (Two hours quiet)            (½ hour quiet followed
                                                                  by 1½ hours noise)
Subgroup NQ    Training              Experimental                 Control
               (Quiet throughout)    (½ hour quiet followed       (Two hours quiet)
                                     by 1½ hours noise)

Note. Sessions were held at one-week intervals.


performance at that time. During the fourth
half hour the two curves diverge considerably,
suggesting that noise may depress perform-
ance only after a fairly considerable period
of time.

An analysis of variance of the data of this 
experiment is presented in Table 2. The dif- 
ference between average performance during 
the experimental and control sessions was not 
statistically significant (.20 > P > .10). The
difference between rate of change of perform-
ance for the two sessions (the sessions by time
at work interaction) was significant at the .05
level. This supports the impression one gets
from viewing Fig. 3 that the differentiation
of performance in the fourth half hour is a
“true” effect. A more detailed report of this
experiment has been prepared for limited cir-
culation (Jerison & Wing, 1957).
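
The F ratios in Table 2 appear to have been formed by dividing each effect's mean square by the mean square for that effect's interaction with subjects, as is usual when every S serves under all conditions. The sketch below checks this assumption against the tabled mean squares; the dictionary keys are shorthand introduced here for illustration.

```python
# Mean squares from Table 2 (E = experimental conditions, C = clocks,
# T = time at work, S = subjects).
mean_squares = {
    "E": 8490.08, "E x S": 2900.09,
    "C": 489.31, "C x S": 396.08,
    "T": 479.67, "T x S": 75.89,
    "E x C": 280.52, "E x C x S": 238.36,
    "E x T": 600.47, "E x T x S": 172.60,
    "C x T": 138.63, "C x T x S": 105.09,
    "E x C x T": 253.10, "E x C x T x S": 122.13,
}

# Each effect is tested against its own interaction with subjects.
effects = ["E", "C", "T", "E x C", "E x T", "C x T", "E x C x T"]
f_ratios = {e: mean_squares[e] / mean_squares[f"{e} x S"] for e in effects}

for effect, f in f_ratios.items():
    print(f"{effect:10s} F = {f:.2f}")

# Degrees of freedom: 9 Ss x 2 sessions x 3 clocks x 4 half hours = 216 scores.
total_df = 9 * 2 * 3 * 4 - 1
print("total df:", total_df)
```

Under this assumption every recomputed ratio matches the tabled F to two decimals, and the total degrees of freedom come to 215.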


Fig. 1. Octave band analyses of noise used in
these experiments. Upper curves are of “Noise” in
Experiment I and Experiments II and III.
Lower curves are of “Quiet.” Over-all
sound pressures (.02-20 kc.) are shown at the right.
(Ordinate: db re .0002 dynes/cm²; abscissa: frequency, kc/sec.)

Before going on to the next experiments it 
is of some interest to note that vigilance as 
measured here did not become less adequate 
as a result of fatigue alone. This result, the 
absence of a performance decrement during 
the two-hour control session in quiet, is con- 
trary to that reported by Mackworth (1950) 


Fig. 2. The display and response panels of Ex-
periment I. Dial pointers normally stepped through
3.5 degree arcs.

Fig. 3. Average performance of the nine Ss in Ex-
periment I during successive half hours of the experi-
mental and control sessions. (Ordinate: percent correct;
abscissa: time at work, min.; curves: “Quiet” and “Noise.”)


Table 2

Analysis of Variance for Experiment I

                                        Mean
Source                          df     Square      F
Subjects (S)                     8    6544.90
Experimental conditions (E)      1    8490.08     2.93
E × S                            8    2900.09
Clocks (C)                       2     489.31     1.24
C × S                           16     396.08
Time at work (T)                 3     479.67     6.32**
T × S                           24      75.89
E × C                            2     280.52     1.18
E × C × S                       16     238.36
E × T                            3     600.47     3.48*
E × T × S                       24     172.60
C × T                            6     138.63     1.32
C × T × S                       48     105.09
E × C × T                        6     253.10     2.07
E × C × T × S                   48     122.13
Total                          215

* Significant at the .05 level.
** Significant at the .01 level.


for a simpler vigilance task. No explanation
for this discrepancy will be attempted here;
it is discussed in greater detail elsewhere
(Jerison & Wing, 1957) and has been found
again in a subsequent experiment with the
same task (Jerison & Wallis, 1957).


Experiment II: Noise and Complex Mental
Counting


The procedure in this experiment was developed
as a result of a suggestion by Miles (1953) that Ss
working in high energy noise fields could not keep
an accurate count of how far they had gone in a
repetitive task. The complex mental counting task
is described in detail elsewhere (Jerison, 19[illegible]).
Briefly, it consists of a display of three periodically
flashing lights; the S’s task was to count the num-
ber of times each light flashed and to maintain sepa-
rate counts for each light. He responded by press-
ing a button under a light when that light had
flashed N times and began the count for that light
again. (For this experiment N was always [illegible].)
The display and response panels used in this experi-
ment are illustrated in Fig. 4. Behind the display
is the loudspeaker which broadcast the noise. Four-
teen Ss were used.

Fig. 4. The display and response panels of Ex-
periment II. Behind the display is the loudspeaker
cabinet.
Fig. 5. Performance of the 14 Ss of Experiment II given separately for the seven-sub-
ject subgroups “QN” and “NQ” during successive half hours of the experimental and con-
trol sessions. (Panels: Group QN, Session II-QQQQ and Session III-QNNN; Group NQ,
Session II-QNNN and Session III-QQQQ. Abscissa: time at work, minutes.)


The relevant results of this experi-
ment are presented in Fig. 5, which shows the
average percentage of correct responses for
the two subgroups separately for the second
and third sessions. Subjects in subgroup QN
showed [illegible] in performance during
the two hours of the second (quiet
control) session. In the third session,
when the noise level was raised to 111.5 db
after the first half hour, a small decrement
in the performance curve is
[illegible]. Subjects in subgroup NQ
showed a steady decrement from their high
performance level of the quiet first half hour
of the second (experimental) session after
the noise level was raised, with a total fall in
performance of over 25%. In the third (con-
trol) session in quiet this group repeated the
[illegible] drop in performance of
about [illegible]. This general effect (the sessions
by experimental conditions by time interac-
tion) was significant at the .001 level. A
summary of the rather lengthy analysis of
variance for this experiment is presented in a
more detailed report for limited circulation
(Jerison, 1956).

This result suggests that working on this
difficult task for two hours under
the noise regime conditioned Ss to a pro-
[gressive] breakdown of performance, and this
conditioning was maintained in the subse-
quent session. Working in quiet first,
on the other hand, appeared to dispose the
Ss to maintaining their original perform-
ance level, and this tendency, too, was main-
tained in the subsequent session despite the
noise in that session. Recent ex-
periments by Broadbent (1957, 1958) ap-
pear to support this finding.


Experiment III: Noise and Time Judgment


While performing the counting task the Ss of Ex-
periment II were also required to press a telegraph
key (illustrated in Fig. 4, lower right) at what they
judged to be 10-minute intervals.


The results of Experiment III are
presented in Fig. 6, which shows the aver-
age S’s responses during suc-
cessive half hours of the experimental and
control sessions. (The subgroups were com-
bined because no order effect appeared here.)
The results were analyzed with t tests. The
differences between half hours within the con-
trol session were not statistically significant,
nor was the difference between time judg-
ments in the first half hour of the control
and experimental session significant. The
differences between the first half hour and
succeeding half hours of the experimental
session were all significant at the .05 level or
better, and the difference between the aver-
aged judgments of the last one and one-half
hours of the control and experimental sessions
was significant at the .02 level. In other
words, a significant difference was found be-
tween time judgments as measured in this
experiment when the comparison was be-
tween judgments in noise and judgments in
quiet. A more detailed report of this experi-
ment for limited circulation has appeared else-
where (Jerison & Smith, 1955).

Fig. 6. Time judgments for the experimental and
control sessions of Experiment III during successive
half hours. (Ordinate: time (min.) judged equal to 10 min.;
abscissa: time at work (min.); curves: 77.5 db OASL and
111.5 db OASL.)


Discussion 


It is clear that noise produces readily measurable
changes in human performance. The
specific changes involved in the three experi- 
ments described here are discussed in detail 
in each of the technical reports devoted to 
them (Jerison, 1956; Jerison & Smith, 1955; 
Jerison & Wing, 1957). The purpose of the 
present discussion is to consider these results 
in a more general way and to seek some con- 
stant features that appear in all of them. 




One of the first problems to face is why 
it has been possible to demonstrate differ- 
ences between performance in noise and in 
quiet at all, for, as indicated earlier (cf. Kry- 
ter, 1950), most previous work on this prob- 
lem has given negative results. The main 
new feature that appears in these experiments 
is one suggested by Mackworth (1950) and 
by Broadbent (1953, 1954): Performance 
was measured over long time periods and 
conditions were arranged to allow effects of 
boredom and fatigue to interact with possible 
effects of noise. These conditions were pres- 
ent in all the experiments reported here. The 
implication is that for short, spurt-like efforts
no performance decrements in noise need be 
expected. When sustained performance is re- 
quired, however, and the task is not intrin- 
sically challenging, effects of the sort reported 
here are likely. 

These considerations point to an interpre- 
tation of the results which deemphasizes the 
importance of noise. There is, after all, little 
reason for regarding noise as a peculiar kind 
of devil which produces such unusual inter- 
actions with fatigue and boredom. It seems 
reasonable, instead, to regard the more gross 
effects found as resulting from effects of noise 
on motivational level or emotional balance, 
in short, from noise as a source of psycho- 
logical stress. If this interpretation is correct 
we should expect similar behavioral effects 
from other experiments in which other kinds 
of stress or motivating conditions were
investigated. This is, in fact, the case.
Mackworth (1950) demonstrated that heat stress
resulted in deterioration of performance on a
simple vigilance task, and several experiments
showing changes in the judgment of time
intervals of the order of minutes as a result
of different motivating conditions have been
reported (Filer & Meals, 1949; Gulliksen,
1927; Rosenzweig & Koht, 1933).

Because stress has been introduced as an 
explanatory concept a few remarks on its 
scientific status are in order. The review by 
Lazarus, Deese, and Osler (1952) emphasizes 
the lack of systematic research on effects of 
stress on performance, and, although it at- 
tempts an analysis of theoretical approaches, 
this review does not go significantly beyond 
a statement relating psychological stress to 




changes in motivation and emotion. There
is danger, when using the concept of stress,
of believing that an explanation has been
achieved. Actually, here, and in most other
contemporary usages of the term, we have
achieved little more than communication of an
intuitive judgment about the kind of situation
with which we are dealing.

A final point that should be made is related
to the kind of noise used. The noise was
actually much softer than that found today
in many operational situations. Yet even at
these levels it was clear that “higher mental
processes” were affected. It is obviously
necessary to explore effects of noises of higher
intensity on such processes.


Summary 


The results of three experiments relating
performance changes to noise levels are
reported. Noise levels used were about 80 db
representing “quiet” and 110 db representing
“noise.” Changes in alertness as determined
on a clock-watching task were found after
one and one-half hours in noise though none
were found in quiet. Time judgments (estimation
of the passage of 10-minute intervals) were
distorted by noise; Ss responded on the average
of every nine minutes in quiet and every seven
minutes in noise when instructed to respond at
what they judged to be 10-minute intervals. A
significant but complex effect of noise on a
mental counting task was also found. These
effects are discussed in terms of noise as a
source of psychological stress.


Received May 19, 1958. 


References 


Broadbent, D. E. Noise, paced performance, and
vigilance tasks. Brit. J. Psychol. (Gen. Sect.),
1953, 44, 295-303.

Broadbent, D. E. Some effects of noise on visual
performance. Quart. J. exp. Psychol., 1954, 6, 1-5.

Broadbent, D. E. Effects of noises of high and low
frequency on behavior. Ergonomics, 1957, 1, 21-29.

Broadbent, D. E. Effect of noise on an “intellectual”
task. J. acoust. Soc. Amer., 1958, 30, 824-827.

Filer, R. J., & Meals, D. W. The effect of motivating
conditions on the estimation of time. J. exp.
Psychol., 1949, 39, 327-331.

Gulliksen, H. The influence of occupation upon the
perception of time. J. exp. Psychol., 1927, 10,
52-59.

Jerison, H. J. Combined effects of noise and fa-
tigue on a complex counting task. USAF WADC
Tech. Rep. TR 55-360, AD 95232,² 1955.

Jerison, H. J. Differential effects of noise and fa-
tigue on a complex counting task. USAF WADC
Tech. Rep. TR 55-359, AD 110506,² 1956.

Jerison, H. J., & Smith, A. K. Effect of acoustic
noise on time judgment. USAF WADC Tech. Rep.
TR 55-358, AD 99641,² 1955.

Jerison, H. J., & Wing, Shelley. Effects of noise on
a complex vigilance task. USAF WADC Tech.
Rep. TR 57-14, AD 110700,² 1957.


2 AD numbers refer to ASTIA document numbers. 
Readers who are employed by or have contracts with 
a Federal agency may get these reports by writing: 
Armed Forces Technical Intelligence Agency, Arling- 
ton Hall Station, Arlington 12, Virginia. 




Jerison, H. J., & Wallis, R. A. Experiments on 
vigilance II: One-clock and three-clock monitor- 
ing. USAF WADC Tech. Rep. TR 57-206, AD 
118171,² 1957.

Kryter, K. D. The effects of noise on man. I. Ef- 
fects of noise on behavior. J. Speech Dis. 
(Monogr. Suppl. 1), 1950. 

Lazarus, R. S., Deese, J., & Osler, Sonia F. The 
effects of psychological stress upon performance. 
Psychol. Bull., 1952, 49, 293-317. 

Mackworth, N. H. Researches on the measurement 
of human performance. (Medical Res. Council 
Rep. No. 268), London: H. M. Stationery Office, 
1950. 

Miles, W. R. Immediate psychological effects. In 
BENOX report, an exploratory study of the bio- 
logical effects of noise. Chicago: Univer. Chicago 
Press, 1953. 

Rosenzweig, S., & Koht, A. G. The experience of 
duration as affected by need tension. J. exp.
Psychol., 1933, 16, 745-774. 


Journal of Applied Psychology 
Vol. 43, No. 2, 1959 


CUES USED BY RATERS IN THE RATING OF
TEMPERAMENT REQUIREMENTS OF JOBS ¹


JEWELL BOLING and SIDNEY A. FINE


U. S. Employment Service 


This study was designed to determine if 
word and phrase cues in job definitions could 
be standardized to achieve homogeneous con- 
cepts and interrater agreement in the rating 
of so-called “temperament” requirements of 
jobs. These ratings are part of a larger re- 
search project being carried out by the occu- 
pational research program of the national 
office of the United States Employment Serv- 
ice (Studdiford, 1953). In 1950 the USES 
began a research project designed to develop 
a new occupational classification structure.
This structure, it was felt, should reflect for 
jobs the common worker trait requirements, 
such as aptitudes, interests, and tempera- 
ments. The design of the over-all project 
required that judgments about such worker 
requirements be made basically from the job 
definitions in the Dictionary of Occupational 
Titles (U. S. Dept. of Labor, 1949). Four 
thousand jobs were used in the research (U.S. 
Dept. of Labor, 1956). The problem was how 
to infer temperament requirements from the 
descriptive information in the Dictionary 
using as a definition of temperaments “those 
personality qualities which remain fairly con- 
stant and which reveal a person’s intrinsic 
nature.” 

A series of studies was undertaken designed 
to establish an adequate basis for making 
such judgments. These studies, which turned 
out to be essentially semantic in nature, were 
carried out in three stages. The first stage 
involved an attempt to use the concepts of 
temperaments available in the literature and 
to develop the word and phrase cues in job 
definitions which were related to them. 
Seven raters applied these concepts to the 
rating of a 50-job sample. The second stage 
involved studies to determine the best way 
to formulate the factor concepts in terms of 


carried out in 1950 and 1951. 
printed in this report are avail- 
U. S. Employment Service, Di- 
Methods, Washington 25, D. C. 


1 This study was 
More data than are 
able by writing the 
vision of Placement 


the cues obtained in the first rating. In the 
third stage a second sample of 50 jobs was 
rated by 10 raters according to the revise 
formulations. 


First Stage 


The literature, primarily Cattell and Allport
(Allport, 1943; Allport & Odbert, 1936;
Cattell, 1946), yielded 14 traits, defined
essentially in terms of the characteristics of
people. Although most of the names of the
factors were found in the literature, some were
contrived. Since some of the factors appeared
to be bipolar and this notion was supported
by Cottle (1950), the 14 factors were
arranged in seven bipolar pairs and defined as
below.


Definitions of Temperaments: First Rating


Adaptability to Routine vs. Versatility
Dominance vs. Submissiveness
Self-Control vs. Uninhibitedness
Gregariousness vs. Self-Sufficiency
Objectivity vs. Subjectivity
Creativity vs. Non-Imaginativeness
Rigorousness vs. Valuativeness


Sample of Definitions
with Illustrative Jobs by Title


Self-Control: Disposition toward emotional
control necessary to maintain standard work
performance when confronted with critical,
annoying or unusual situations.
Illustrative jobs: Surgeon, Diver, Fireman.

Uninhibitedness: Disposition to act without
restraint or inhibition; inclination to react
impulsively to [word illegible] situations
without attempting to control excitement or
tension.
No illustrative jobs.


Fifty jobs were rated by seven raters
according to these definitions. The raters were
instructed to select two traits that together
best expressed the temperament pattern of
the job and to justify their ratings by
indicating the cues in the definitions which
prompted them. One factor, Uninhibitedness,
was ruled out as a possible rating, since it
was felt that it did not occur as a require-
ment in jobs. Thus 72 possible patterns could
be rated. It was decided in advance to use
the patterns on which the majority of the
raters agreed as the criterion against which
to measure agreement. Four or more raters
agreed on a common pattern for 28 out of
the 50 jobs. On only one out of the 50 were
there no agreements of two or three raters.
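
The count of 72 possible patterns can be checked with a short enumeration. The sketch below is only an illustration: it assumes (the article does not say so explicitly) that a rater could not pair the two poles of the same bipolar factor, which is the one rule under which 13 usable traits taken two at a time give 72 rather than 78 patterns. The names `BIPOLAR_PAIRS`, `EXCLUDED`, and `allowed_patterns` are hypothetical.

```python
from itertools import combinations

# The seven bipolar pairs listed under the first-rating definitions.
BIPOLAR_PAIRS = [
    ("Adaptability to Routine", "Versatility"),
    ("Dominance", "Submissiveness"),
    ("Self-Control", "Uninhibitedness"),
    ("Gregariousness", "Self-Sufficiency"),
    ("Objectivity", "Subjectivity"),
    ("Creativity", "Non-Imaginativeness"),
    ("Rigorousness", "Valuativeness"),
]
EXCLUDED = {"Uninhibitedness"}  # ruled out as a possible rating


def allowed_patterns(pairs=BIPOLAR_PAIRS, excluded=EXCLUDED):
    """Return every two-trait pattern, skipping excluded traits and
    (by assumption) any pattern made of both poles of one pair."""
    opposite = {}
    for a, b in pairs:
        opposite[a] = b
        opposite[b] = a
    traits = [t for pair in pairs for t in pair if t not in excluded]
    return [(a, b) for a, b in combinations(traits, 2) if opposite[a] != b]


patterns = allowed_patterns()
print(len(patterns))  # 78 unordered pairs of 13 traits, minus 6 same-pair patterns = 72
```

Under this reading the figure decomposes as 78 - 6 = 72; a different exclusion rule would give a different count.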


Second Stage 


The second stage began with the analysis 
of the word and phrase cues developed in the 
first stage. In order to more precisely deter- 
mine which cues were operating for each 
temperament trait, jobs were selected where 
(a) all raters saw the trait, (b) some did and 
some didn’t, and (c) only one saw it or didn’t 
see it. Raters in these three instances were 
asked by questionnaire and interview to again 
justify their ratings. In addition, cues were 
obtained from three new raters rating the 
same jobs. 

All of these cues were assembled for each 
trait. Typical of the results are the cues 
used for Dominance which reflect the double- 
barreled nature of the definition: “Disposition 
to prevail, control, be at the ‘helm’; desire 
for tasks involving planning, determining pro- 
cedures, directing and organizing activities 
and/or influencing or directing the actions of 
others by suggestion, persuasion, or com-
mand.”

Cues such as “influences emotion of audi-
ence by singing” and “influences the actions
of others by suggestion and persuasion
through writing original descriptive advertis-
ing copy” picked up the “influencing” part
of the definition. Other cues such as “is in
complete charge of stables” and “plans and
organizes advertising activities” picked up the
“control” part of the definition.

Interviews with the raters and qualitative
examination of their rating justifications
provided considerable insight into the judgmental
processes which operated in the ratings.
Because the raters were provided with definitions
for temperament traits in terms of people and
illustrated by job titles, the rater had to reason
from the definition of the trait to the
illustrative job titles, then to the specifics in the
illustrative jobs, and finally to the specifics
in the job being rated. To short-cut this
involved process the rater in many cases sim-
ply picked up the most obvious word or
phrase cue in the trait definition and general-
ized from it. The job titles used as illustra-
tions were of little help since they did not
specify what, in the definitions for those jobs,
illustrated the trait.

The second phase of the development of
the temperament concepts got under way at
this point. This involved setting out to define
temperament concepts, not as traits in people,
but as situations calling for those traits,
illustrated with specific examples from job
definitions. The language of these illustrative
situations from the Dictionary of Occupational
Titles was revised to interpret the content in
such a way as to show how the temperament
requirement was operating. To see if these
illustrative situations containing the cues
could now be consistently related to the
revised, but simplified, temperament definitions,
the First Matching Study of cue job situations
was conducted.

Seven occupational analysts with no previous
training or work in temperament analysis were
given 13 simplified factor definitions with
their trait names and 50 illustrative cue
situations with the title of the job definition
in which they occurred. Nine graduate students
in a personnel psychology class were given the
same factor definitions without names but
identified by letters and the illustrative cue
situations without job titles as indicated below:

Trait H—Situations involving performing 
adequately under stress when 
confronted with the critical or 
unexpected, taking risks, or hav- 
ing responsibility for the safety 


of others. 

Works below the surface of the water,
dressed in diving suit and helmet, to drill
holes in rock for blasting purposes at the bot-
tom of lake, harbor, or other body of water.
Risk of suffocation from fouled air hose or
entrapment by rotten, falling timbers is
always present.



Operates upon the human body, incising 
the flesh with very sharp bladed scalpels and 
using the fingers to manipulate organs and 
tissue. Exercises constant care, with no
deflection of attention regardless of distractions,
to avoid any of several injuries or damages 
to the patient which would otherwise almost 
invariably occur. 


Observes passengers during flight to detect 
signs of discomfort, engaging nervous pas- 
sengers in conversation to allay their fears 
and apprehensions, setting an example of 
calm, untroubled demeanor. 


In both instances the instruction was to 
indicate for each cue situation the factor 
which best applied to it, and also to indicate, 
if necessary, second and third choices in rank 
order. 

The average percentage of agreement with 
the criterion was 79% for the occupational 
analysts and 69% for the students. How- 
ever, the Pearson product-moment correlation 
coefficient of .50 with a standard error of .11 
between the two groups of raters indicated 
that, although their over-all matching of the 
situations with definitions was comparable, 
they didn’t agree too well on the same fac- 
tors. Thus, the analysts saw Dominance in 
the cues that were supposed to illustrate it 
but not so the students. The reverse was true 
for Subjectivity with a mixed result in the 
cases of Rigorousness and Versatility. For 


example, one of the situations for Rigorous- 
ness was: 


Sculptor—‘Making models and carving 
statues requires patience and painstaking 
endeavor in order to achieve a work of art 
with desired line and proportion.” 


Nevertheless, two analysts rated this situa- 
tion for Creativity because of the title and 
the fact that a Sculptor is an artist. One 
rated it for Self-Control because of the word 
“patience.” 
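
The standard error of .11 quoted earlier for the correlation of .50 between analysts and students can be reproduced with the classical large-sample formula for the standard error of a Pearson r. This is a sketch under two assumptions not stated in the article: that the formula used was (1 - r²)/√(N - 1), and that N was the 50 cue situations. The function name is hypothetical.

```python
import math


def standard_error_of_r(r, n):
    """Classical large-sample standard error of a Pearson r:
    (1 - r^2) / sqrt(n - 1)."""
    return (1.0 - r * r) / math.sqrt(n - 1)


# Assumed inputs: r = .50 as reported, n = 50 cue situations.
se = standard_error_of_r(0.50, 50)
print(round(se, 2))  # (1 - .25) / 7 = .107, which rounds to .11
```

With N taken as the 13 factors instead, the same formula gives roughly .22, so the reported .11 is at least consistent with a correlation computed over the 50 situations.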

For another illustrative situation of Rigor- 
ousness: 


Surgeon—“Because of the value placed on 
human life and responsibility residing in 
the surgeon, he must exercise utmost care 
in performance,” 




three analysts rated Self-Control simply
generalizing from the title although another
situation taken from this same job was written up
to be a criterion situation for Self-Control.
It read as follows:


Surgeon: “In performance of surgery,
worker is confronted with emergency and/
or critical situations, which require him to
remain calm and collected; if not, the
patient’s life is endangered.”


All the analysts and students agreed on
rating this item for Self-Control.
This type of analysis revealed other di E 
culties with the situations. The factor cma 
tivity was defined thus: “Disposition, a eed 
ganize feelings or knowledge into new imani 
systems, or practical constructions.” Oa 
the situations used in the First Mate 0 
Study read: “Originality or the application w 
imagination important in developing an 
ways of expressing the dance created DY is 
other, or in creating a new dance. trait 
situation was supposed to illustrate the p 
Creativity but some of the raters saW i js- 
more strongly the trait Subjectivity: 5 
position to interpret phenomena in termat 
personal viewpoint; desire for sity a su 
which necessitate or permit injection of ‘thei 
Follow-up on the cues which mediated, rigi 
judgments revealed that the words ‘7, 
nality” and “imagination” appearing 1 , 
situation for Creativity influenced a Ha 
for Subjectivity rather than Creativity- , 
Another example of how a word in & Shou 
tion illustrating a trait operated as a shor ral 
cue to another trait, bypassing the bas!¢ the 
concept involved in the situation, was ity: 
word “research” in a situation for Creat! jo 
It influenced a rating for Valuativeness. g” 
a situation for Valuativeness, the word 
influenced a rating for Objectivity, yen 
nition of which contained the word crea” 
The word “scientific” in a situation for A 
tivity influenced a rating for ObjectivitY” je 
situation for Valuativeness, “Must be ters 
to judge ‘temper’ of crowd,” led some word 
to rate Gregariousness because of the Y 
“crowd.” this 
Several conclusions were drawn from 
First Matching Study. 


First, certain classes of cues in particular
produced associative rather than analytical
thinking. These were job titles, the names of 
traits, and the interpretations in the sample 
situations that were supposed to show the 
temperament requirement. This associative 
thinking did not help reliability. 

Second, the definition of a trait needed to 
be expressed through a range of situations as 
actually worded in job descriptions and thus 
illustrate the factor from as many aspects as 
possible; that is, although the wording of any 
one situation could be interpreted to reflect 
several traits, the concept of a particular trait 
had to be established in an over-all situational 
context. 

On the basis of these conclusions and study 
of the cues used by the raters in rating the 
first 50 jobs and the First Matching Study, 
the factors were once again redefined. This 
time they were defined as types of situations 
calling for traits (rather than as traits them- 
selves), and illustrated with a range of sam- 
ple situations, worded as in job descriptions. 
Thirteen factors emerged in the form origi- 
nally used with the graduate students men- 
tioned earlier, with the following changes: 
Creativity was merged with Subjectivity; 
Dominance was divided into Executiveness 
and Persuasiveness; Non-Imaginativeness was 
merged with Adaptability to Routine; and
Isolativeness was broken out of Self-Suffi-
ciency. However, it should be noted that
these titles too were subsequently dropped for 
letter designations. 

The Second Matching Study was under- 
taken to see if these factor definitions and 
groups of situations could be matched when 
scrambled. Two groups of occupational ana- 
lysts were used for this purpose—10 experi- 
enced and four inexperienced in temperament 
analysis. 

Table 1 partially indicates the results of
this study. With the exception of Self-Ade-
quacy, either nine out of 10 or all 10 expe-
rienced analysts correctly related the group
of situations to the factor definitions. Simi-
larly, with the exception of Self-Adequacy
and Versatility, three out of four or all
four of the inexperienced analysts correctly
related the group of situations to the factor
definitions. With minor exceptions only the
inexperienced analysts saw other than the
criterion factor definitions as covering the
groups of situations, but the definition and
the cues that were responsible for this became
apparent. For example, the definition for
Self-Adequacy was rated as covering the
groups of situations for Objectivity, Versa-
tility, Executiveness, Isolativeness, Valuative-
ness, and Subjectivity. Analysis showed that


Table 1


Agreements and Disagreements in Matching Groups of Situations with
Temperament Definitions for Two Groups of Raters


           Experienced Analysts (N = 10)    Inexperienced Analysts (N = 4)
Factors    No. Seeing     No. Not Seeing    No. Seeing     No. Not Seeing
           the Criterion  the Criterion     the Criterion  the Criterion
2 2 
VARCH 9 1 A 3 
MVC 10 0 j g 
DCP 9 1 : i 
ISOL 10 : i ; 
(Self-Adequacy)* 8 ; 4 
REPSC 9 à i R 
USI 10 j 4 
sje 3 i 3 1 
SJC 10 D : ` 
INFLU 10 k 3 a 
DEPL 10 $ a 
STS 10 0 ; 
FIF 10 0 
* Dropped. 




this was indeed an overlapping factor that 
had too wide an application; hence Self-Ade- 
quacy was dropped as a factor. Ratings for 
Executiveness and Persuasiveness showed up 
for a group of situations that were supposed 
to go with Gregariousness, but in this case 
situations could be added and subtracted from 
the group to improve its homogeneity. 

For example, evaluation of situations in 
Gregariousness (DEPL) resulted in additions 
and subtractions to reduce overlap with other 
factors. The definition of Gregariousness was 
as follows: “Situations involving the necessity 
of dealing with people in actual job duties 
beyond giving and receiving instructions.” 
One situation for this had read: “Endeavors 
to sell gas-powered equipment to existing and 
prospective industrial and commercial cus- 
tomers to increase use of gas in territory.” 

This situation tended to function as a cue 
for Persuasiveness. The following situation 
was substituted because it functioned as a 
cue for Gregariousness only. “Makes appoint- 
ments for employer with clients or customers 
by mail, phone, or in person.” 

An example of a situation retained as nearly 
always providing stronger cues for Gregarious- 
ness than any other factor was this: “Pro- 
motes sales and creates good will for his firm’s 
products by preparing displays, touring the 
country, making speeches at retail dealers’ 
conventions, and calling on individual mer- 
chants to advise on ways and means for 
increasing sales.” 

Similar analysis resulted in the 12 defini- 
tions and groups that were used to express 
the temperament concepts in their final form. 
This form was essentially the same as pre- 
viously illustrated for Trait H, except that 
(a) numbers were substituted for letters and
(b) verbal symbols made up of the initial
letters of the key words in the definition were
also added for identification. Following is a
list of the definitions and their numerical and
verbal designations:


1. VARCH: Situations involving a variety
of duties often characterized by frequent
change.

2. REPSC: Situations involving repetitive
or short cycle operations carried out according
to set procedures or sequences.

3. USI: Situations involving doing things
only under specific instruction, allowing little
or no room for independent action or judg-
ment in working out job problems.

4. DCP: Situations involving the direc-
tion, control, and planning of an entire ac-
tivity or the activities of others.

5. DEPL: Situations involving the neces-
sity of dealing with people in actual job du-
ties beyond giving and receiving instructions.

6. ISOL: Situations involving working
alone and apart in physical isolation from
others, although activity may be integrated
with that of others.

7. INFLU: Situations involving influenc-
ing people in their opinions, attitudes, or
judgments about ideas or things.

8. PUS: Situations involving performing
adequately under stress when confronted with
the critical or unexpected or taking risks.

9. SJC: Situations involving the evaluation
(arriving at generalizations, judgments, or
decisions) of information against sensory or
judgmental criteria.

0. MVC: Situations involving the evalua-
tion (arriving at generalizations, judgments,
or decisions) of information against meas-
urable or verifiable criteria.

X. FIF: Situations involving the interpre-
tation of feelings, ideas, or facts in terms of
personal viewpoint.

Y. STS: Situations involving the precise
attainment of set limits, tolerances, or stand-
ards.
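
For anyone tabulating ratings made with these designations, the list above can be held as a small lookup table. The sketch below is illustrative only; the helper `encode_pattern` and the idea of writing a two-factor pattern as a two-character code are assumptions for the example, not a procedure described in the article.

```python
# Designation -> verbal symbol, in the order given in the list above.
FACTORS = {
    "1": "VARCH", "2": "REPSC", "3": "USI", "4": "DCP",
    "5": "DEPL", "6": "ISOL", "7": "INFLU", "8": "PUS",
    "9": "SJC", "0": "MVC", "X": "FIF", "Y": "STS",
}

ORDER = list(FACTORS)  # designations in listed order
SYMBOL_TO_CODE = {symbol: code for code, symbol in FACTORS.items()}


def encode_pattern(symbol_a, symbol_b):
    """Hypothetical helper: write a two-factor pattern as a two-character
    code, ordered as in the list so that A-B and B-A encode identically."""
    codes = sorted(
        (SYMBOL_TO_CODE[symbol_a], SYMBOL_TO_CODE[symbol_b]),
        key=ORDER.index,
    )
    return "".join(codes)


print(encode_pattern("VARCH", "DCP"))  # 14
print(encode_pattern("STS", "MVC"))    # 0Y
```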


Third Stage 


The 12 revised trait definitions were applied
to the rating of the second sample of 50 jobs.
Ten raters participated. The instructions
were to select the two factors that together
best expressed the temperament pattern of
the job. Here again the rater had to justify
his pattern, i.e., indicate the cues which led
him to his conclusions.

In this final rating, 30 out of the 50 jobs
had five or more raters agreeing on the tem-
perament patterns. All the other jobs had
two to four raters agreeing on patterns. This
compares with 28 out of 50 jobs on which
there were majority agreements in the initial
rating. However, these latter agreements
involved 10 rather than seven raters, a more
difficult situation in which to get majority
agreement.


Table 2


Mean Occurrence Out of 100 Ratings of Temperament Factors in the Rating of
Two Groups of 50 Jobs


Temperament Factors (a)                      Mean Occurrence (b)
Initial Rating             Final Rating      Initial Rating   Final Rating

Non-Imaginativeness                               29.1
Rigorousness               STS                    17.6            25.1
Adaptability to Routine    REPSC                  16.4            19.1
Versatility                VARCH                   6.5         [illegible]
Submissiveness             USI                     4.0         [illegible]
Objectivity                MVC                     6.0             9.4
Dominance                  DCP                     2.0             9.3
Creativity                                         4.0
Gregariousness             DEPL                    4.0             3.0
Subjectivity               FIF                     2.0             2.9
Persuasiveness             INFLU                                   7.4
Valuativeness              SJC                     1.0             5.5
Self-Control               PUS                     1.0             1.0
Self-Sufficiency                                   6.5
Isolativeness              ISOL                                     .8

(a) Occurrences of the Final Rating appear adjacent to their nearest
equivalent in the Initial Rating, although they are changed as explained
in the text.
(b) Blanks under Initial and Final Rating indicate that these factors were
not involved as such in the respective ratings.


Table 2 shows the greater spread in the
rating occurrence of the factors in the final
rating over the initial rating. This wider use
of the factors suggests a greater understand-
ing of the factors in relation to the job in-
formation available and a more discriminating
use of them in making judgments.

A comparison of the cues obtained for
DCP (Direction, Control, Planning), i.e.,
“Situations involving the direction, control,
and planning of an entire activity or the
activity of others” with those obtained for
the predecessor factor Dominance as set out
previously indicates how the qualitative re-
sults improved. Inferences that the trait is
operating are now justified on the basis of
homogeneous word and phrase cues such as
these: “individual performance permits self-
direction,” “is in complete charge of stables,”
“is pretty much on his own,” “carries out
activity without supervision,” “coordinates
the operation,” “determines procedures,”
“plans and carries out an entire activity.”


Discussion 


The criterion used for expressing agreement 
was not very satisfactory. It penalized sig- 
nificant agreements. The cues obtained for 
Sugarcane Planter illustrate the point. Agree- 
ments on a pattern of two traits for the 10 
raters ran thiswise 3, 3, 1, 1, 1, 1. The pat- 
terns obtained were as follows: 

DCP-VARCH    3 raters
DCP-SJC      3 raters
DCP-MVC      1 rater
DCP-DEPL     1 rater
VARCH-MVC    1 rater
VARCH-STS    1 rater


It is evident from these patterns that eight 
of the raters agree on DCP and five on 
VARCH. Moreover, substantially the same 
cues were operating for the choices. Note 
the cues given as justification by the five 
raters who saw VARCH as part of the pat- 
tern: “many different tasks carried on . . . 
different tools and equipment,” “variety of 
work carried out,” “variety of duties in- 
volved,” “grows, plants, cultivates, harvests, 
markets,” “plants and cultivates . . . cuts and 
hauls . . . engages seasonal labor.” 
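
The contrast between pattern-level and factor-level agreement for Sugarcane Planter can be tallied directly from the six patterns listed above. This is a minimal sketch; the variable and function names are hypothetical, and only the counts come from the article.

```python
from collections import Counter

# Two-factor patterns chosen for Sugarcane Planter and the number of
# raters choosing each, as listed in the text.
pattern_counts = {
    ("DCP", "MVC"): 1,
    ("VARCH", "MVC"): 1,
    ("VARCH", "STS"): 1,
    ("DCP", "VARCH"): 3,
    ("DCP", "SJC"): 3,
    ("DCP", "DEPL"): 1,
}


def factor_agreement(counts):
    """Count how many raters included each single factor in their pattern."""
    totals = Counter()
    for pattern, n_raters in counts.items():
        for factor in pattern:
            totals[factor] += n_raters
    return totals


n_raters = sum(pattern_counts.values())      # 10 raters in all
best_pattern = max(pattern_counts.values())  # largest agreement on a full pattern
totals = factor_agreement(pattern_counts)
print(n_raters, best_pattern, totals["DCP"], totals["VARCH"])  # 10 3 8 5
```

No full pattern attracts more than three of the 10 raters, so the job fails the majority criterion even though eight raters saw DCP and five saw VARCH.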

This same situation was true for practically
all jobs not having a majority pattern in the
second rating of 50 jobs.

Recently, Wherry (1957) in discussing the 
future of criterion research stressed the need 
for “measured interest in the field of job and 
situational analysis techniques including a 
still better definition of the needed elements 
and of methods of estimating their presence 
and importance of both criteria and tests.” 
It may be that the procedure and resultant 
factors outlined here suggest an improved job 
analysis instrument for making the situational 
analysis discussed by Wherry. At any rate 
it seems likely that we must arrive at a more 
effective understanding of the role of language 
as a mediating element in our attempt to get 
at criteria. 


Summary 


A method of rating 
ments of jobs on the 
obtained from written job descriptions was 
developed in order to reflect temperament 
information in a functional occupational clas- 
sification structure. The early procedure was 
to adapt from the literature clinical concepts 
of temperament traits as they occur in people. 
Tryout of this procedure produced associative 
rather than analytical thinking and did not 
achieve reliability. Through a series of stud- 
ies of the shared and unique word and phrase 
cues which led to raters? inferences about 
temperament requirements, concepts of tem- 
peraments were formulated not as traits in 


temperament require- 
basis of information 


Jewell Boling and Sidney A. Fine 


people but as situations in jobs requiring com- 
mon adjustments of workers. These concepts 
were defined by an over-all situational con- 
text rather than as clinical concepts of the 
traits themselves. Greatly improved relia- 
bility was obtained. It was suggested that 
defining “temperaments” in terms of the kind 
of situations to which workers must adjust
may be an effective first step toward a more 
adequate criterion for measuring personality 
concomitants of successful job adjustment. 


Received June 23, 1958.


References 


Allport, G. W. Personality, a psychological
interpretation. (Rev. ed.) New York: Holt, 1943.
Allport, G. W., & Odbert, H. S. Trait names: A
psycho-lexical study. Psychol. Monogr., 1936,
47, No. 1 (Whole No. 211).
Cattell, R. B. Description and measurement of
personality. New York: World Book, 1946.
Cottle, W. C. A factorial study of the Multiphasic,
Strong, Kuder, and Bell inventories using a
population of adult males. Psychometrika, 1950,
15, 25-47.
Studdiford, W. S. New occupational classification
structure. Employment Security Rev., 1953,
(9), 5, 37.
U. S. Department of Labor. Estimates of worker . . .
Wherry, R. The past and future of criterion
evaluation. Personn. Psychol., 1957, 10, 4.


Journal of Applied Psychology 
Vol. 43, No. 2, 1959


AN EXPERIMENTAL EVALUATION OF “NO-PRESSURE” INFLUENCE ¹


E. PAUL TORRANCE 


Bureau of Educational Research, University of Minnesota


A major problem in any influence relationship
is to determine the degree of pressure to be
used. Until some 10 or 15 years ago, salesmen
were usually thought of as using “fast words”
and pretentious claims and doing much “pushing”
to constitute what was known as “high-pressure”
methods of selling. As business became more
“professional” and salesmanship along with it,
there emerged a recognition that there was a
hidden resource in the sales prospect himself.
The prospect liked to buy from the salesman who
gave him a chance to trust and respect him,
something that occurred only if the prospect
felt that he was being permitted to consider
the salesman’s proposition fairly and
rationally. In 1947, Bursk (1956) labeled this
new sales technique “low-pressure selling.”
Since then, “low-pressure selling” has enjoyed
considerable vogue. Recently, however, Bursk
(1956) and others have been bemoaning the fact
that what was “low-pressure selling” has
deteriorated into “no-pressure” selling.

Even earlier than this development in selling,
counselors and psychotherapists were
discovering resources in their clients
essentially similar to those observed by the
“low-pressure” salesman. In 1942, Carl Rogers
(1942) published his formulation of counseling
which has been labeled variously as
“client-centered” and “nondirective.” A
description of the


¹ This report is based on work done under ARDC
Project No. 7723, Task No. 77461, in support of
the research and development program of the Air
Force Personnel and Training Research Center,
Lackland Air Force Base, Texas. Permission is
granted for reproduction, translation,
publication, use, and disposal in whole and in
part by or for the United States Government.
The opinions or conclusions expressed or
implied herein are those of the authors. They
are not to be construed as necessarily
reflecting the views or endorsement of the
Department of the Air Force or of the Air
Research and Development Command.
The author gratefully acknowledges the
assistance of . . . Buck and Raigh Mason in
planning and conducting the experiment
described herein and . . . E. Hawkins for his
work in analyzing the data.


Rogerian technique is psychologically very 
much like Bursk’s description of “low-pressure 
selling.” Similarly, in the hands of many 
poorly equipped practitioners, Rogers’ tech- 
nique may be said to have drifted into a truly 
“nondirective” method. 

Even earlier, Progressive Education had 
given recognition to these resources in learners 
and attempted to devise programs of educa- 
tion which would capitalize upon them. It 
too, in the hands of ill-trained teachers, 
drifted into a rather laissez-faire approach. 
Apparently low-pressure selling, client-cen- 
tered counseling, and Progressive Education 
all require personnel of high caliber and ade- 
quate training. They all require people who 
are sensitive to the needs of others, believe in 
these inner resources of others, and are will- 
ing to fulfill their roles as leaders and re- 
source persons without resort to trickery. 

Three factors suggested a need for a re- 
evaluation of the “no-pressure” technique. 
One was Bursk’s recent discussion (1956) of 
“no-pressure” techniques as a kind of “sign 
of the times” and occasional observations that 
there may be developing a “reaction forma- 
tion” to such techniques. The second came 
from complaints of many observers that edu- 
cation has become “too soft” and needs to 
stiffen curricula and discipline. There have 
also been cries that counselors are not
“persuasive” enough, particularly in their
educational and vocational counseling of abler
students. The third came from previous research
(Torrance & Mason, 1956), in which it was found
that men who perceived their instructors as
making no effort to influence them behaved more
in line with the intended effort to influence
than those who felt that their instructors were
trying to influence them. The finding, however,
was a post hoc one and required more definitive
exploration.

Procedure 
The setting of the experiment was the simulated
nine-day survival exercise of the USAF Survival
Training School. Instructors of groups of six to
12 men attempted to influence trainees to react 
favorably to an emergency ration known as “pem- 
mican.” The Ss were 427 aircrewmen undergoing 
survival training. All Ss received a double issue of 
the emergency ration including a total of eight 
meat bars. 

A total of 43 instructors in two successive classes 
were involved. Prior to the exercise, training groups 
were divided randomly into one control and six 
experimental groups. In small-group sessions, the 
author and two experienced colleagues trained the 
instructors in the experimental technique to which 
they had been assigned randomly. These techniques 
were designed to represent various hypothesized 
degrees of influence. 

In Experimental 1, instructors were briefed to 
make no effort whatever to influence trainees to ac- 
cept the ration. In Experimental 2, they were in- 
structed to make no effort to influence the accepta- 
bility of the ration except by eating the ration 
themselves and “setting a good example.” Those 
in Experimental 3 were asked to give information 
about the value of the meat bar as an emergency 
ration and about ways of preparing it in an objec- 
tive, “take-it-or-leave-it” manner. Those in Experi- 
mental 4 in addition were asked to emphasize to the 
group the psychological factors (group explanation) 
which affect acceptability. Instructors in Experi- 
mental 5 were given the same instructions as those 
in Experimental 4 except that they were to give 
their explanations to individuals (individual explana- 
tion) rather than to the group. In Experimental 6, 
instructors were briefed to use what we considered 
the coercive method of informing trainees that food 
indoctrination was an integral part of their training 
and that they would be “graded down” if they did 
not “really” try the ration.

Although the hypothesized degree of pressure was
in the order listed, questions were asked at the end
of training so that the experimental techniques could
be reordered according to degree of pressure
perceived by trainees. First, they were asked outright
to indicate the degree to which their instructor had
tried to influence them. Second, they were asked
to indicate which of several influence techniques
their instructors had used. This list included:
described preparation, demonstrated preparation, ate
ration, gave nutritional facts, explained reasons for
using in training, told about psychological effects,
told use of ration would count on grade, and the
like.

At the end of training, the following four types
of acceptability indicators were obtained from each
S: (a) the traditional hedonic scale (seven-point)
requiring the S to indicate his reactions to each of
five methods of preparing the meat bar; (b) the
number of bars eaten; (c) reasons for not eating
the remainder (made me sick, too greasy, etc.);
(d) the conditions under which the S would use
the ration in the future.

To determine the perceived pressure of each
technique, a weight of “1” was assigned when S
reported that his instructor had made no effort to
influence him; “2” when some effort to influence was
perceived; and “3” when very much effort was
reported. An analysis was also made to determine
what specific types of influence acts were perceived
by Ss under the “no-influence” condition. Three
categories of perceived pressure were established by
placing: in the “no-pressure” category those who
checked none of the influence acts as performed by
their instructor, in the “low-pressure” category those
who checked from one to five types of instructor
influence acts, and in the “high-pressure” category
those who checked six or more types of instructor
influence acts.
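
The weighting and classification rules just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, and the index is assumed to be the weighted sum of the three percentages (1 for no effort, 2 for some effort, 3 for very much effort), an inference that happens to agree with the Control, Exp. 3, and Exp. 5 rows printed in Table 2.

```python
# Weights taken from the Procedure: 1 = no effort, 2 = some, 3 = very much.
WEIGHTS = {"none": 1, "some": 2, "much": 3}


def pressure_index(pct_none, pct_some, pct_much):
    """Weighted sum of the percentages perceiving each degree of effort.

    Assumption: the article does not spell out the formula; this is inferred.
    """
    return (WEIGHTS["none"] * pct_none
            + WEIGHTS["some"] * pct_some
            + WEIGHTS["much"] * pct_much)


def pressure_category(n_act_types_checked):
    """Classify a trainee by how many types of influence acts he checked."""
    if n_act_types_checked == 0:
        return "no-pressure"
    if n_act_types_checked <= 5:
        return "low-pressure"
    return "high-pressure"


# Percentages (None, Some, Much) as printed for three rows of Table 2.
table2_rows = {
    "Control": (29, 55, 16),
    "Exp. 3 (Info.)": (47, 48, 5),
    "Exp. 5 (Ind. Expl.)": (15, 49, 36),
}
indexes = {name: pressure_index(*pcts) for name, pcts in table2_rows.items()}
```

A higher index means more perceived pressure, so ranking the conditions by this value gives the "Rank in Pressure" ordering used in the Results.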


Results

First, an effort was made to obtain a picture of
precisely what instructors who were briefed to
make “no effort” to influence were perceived to
do. Table 1 presents the number and percentage
of Ss who perceived these instructors as
performing each of 10 types of influence acts
listed. The instructors’ self reports are also
shown in Table 1. From these data, it can be
concluded that it is difficult for an instructor
to deny his role as influencer. Although
instructors assigned the “no-influence”
technique reported fewer influence acts than
those assigned any other condition, they ranked
fourth in seven in the proportionate influence
acts perceived by trainees.


Table 1

Number and Percentage of Trainees and Instructors Perceiving Various
Instructor Influence Acts Under “No-Pressure” Condition

                                             Trainee        Instructor
                                             Perception     Perception
Influence Act                                No.    %       No.    %
Described preparation                        28     49      2      ...
Demonstrated preparation                     10     18      1      ...
Ate meat bar during training                 35     61      3      ...
Gave nutritional facts                       25     44      1      ...
Explained reasons for use in training        25     44      1      ...
Advised eating small bits                     9     16      1      ...
Advised not to eat when overly fatigued       5      9      0      0
Advised to eat before becoming too hungry    10     18      0      ...
Told about psychological factors involved    12     21      1      0
Told use of ration would count on grade       1      2      0      ...

Note.—N of trainees = 57; N of instructors = 7.


Second, to determine the relative degree of
pressure perceived for each condition,
percentages were first determined for each of
the three degrees of perceived effort. Then
weightings were applied to yield comparable
indexes. These data are presented in Table 2.
The over-all chi square, using the raw data, is
56.71 (df = 8, p < .001), indicating that
significant effects occur among the conditions.
Roughly speaking, Experimental III (Giving
Information) and Experimental I (No Effort)
come nearest qualifying as “no-pressure”
methods. The Controls, Experimental IV (Group
Explanation), and Experimental VI (Evaluation)
may be categorized as “low-pressure” methods,
while Experimental V (Individual Explanation)
and Experimental II (Good Example) would
qualify as “high-pressure” conditions.


Table 2

Percentage Perceiving Various Degrees of Instructor Effort to Influence
Under Seven Conditions and Pressure Indexes

                              Percentages
                                                    Pressure   Rank in
Condition             No.     None   Some   Much    Index      Pressure
Control               76      29     55     16      187        3
Exp. 1 (No Infl.)     57      41     55      4      159        2
Exp. 2 (Good Ex.)     63      14     68     18      204        6
Exp. 3 (Info.)        62      47     48      5      158        1
Exp. 4 (Grp. Expl.)   61      21     63     16      195        4
Exp. 5 (Ind. Expl.)   65      15     49     36      221        7
Exp. 6 (Evaluation)   43ᵃ     12     74     14      202        5

Note.—Chi square (based on numbers perceiving various degrees of
pressure) = 56.7056; df = 8; p < .001.
ᵃ The number of Ss in Exp. 6 was reduced by eliminating one crew whose
instructor was replaced by an unbriefed instructor in . . .

Table 3

Means and Standard Deviations of Hedonic Ratings and Number of Meat Bars
Consumed and Percentages Made Sick and Intending to Use Meat Bar in
Future for Seven Conditions

                      Hedonic Rtg.     Bars Consumed    Made Sick       Eat in Fut.
Condition             Mean     SDᵃ     Mean     SDᵇ     No.    Pctg.ᶜ   No.    Pctg.ᵈ
Control               21.50    7.03    7.22     3.02    8      ...      ...    ...
Exp. 1 (No Infl.)     21.61    6.88    5.66     3.07    5      ...      ...    ...
Exp. 2 (Good Ex.)     23.84    6.58    5.66     2.34    1      ...      ...    ...
Exp. 3 (Info.)        19.63    7.46    7.95     4.60    ...    ...      ...    ...
Exp. 4 (Grp. Expl.)   19.21    5.90    6.75     ...     ...    ...      ...    ...
Exp. 5 (Ind. Expl.)   23.15    6.74    5.57     ...     4      ...      ...    ...
Exp. 6 (Evaluation)   18.09    6.15    7.79     1.47    3      ...      ...    ...

ᵃ Conditions for homogeneity of variance satisfactorily met. F between
highest and lowest variance = 1.608, not significant. F ratio (between
groups to within groups) = 5.41, p < .01.
ᵇ Using Bartlett's Test, requirements for homogeneity of variance not
satisfied (p < .001). F ratio (between groups to within groups) = 4.67,
p < .01.
ᶜ Chi square = 16.759; df = 6; p < .02.
ᵈ Chi square = 27.227; df = 6; p < .01.


Table 4

Means and Standard Deviations of Hedonic Ratings and Number of Meat Bars
Consumed and Percentages Made Sick and Intending to Use Meat Bar in
Future for Each of Three Degrees of Perceived Pressure

Pressure                             Hedonic Rtg.     Bars Eaten       % Made   % Eat in
(No. Types Infl. Acts)      No.      Mean     SDᵃ     Mean     SDᵇ     Sickᶜ    Fut.ᵈ
No Pressure (0 acts)        38       18.32    7.39    8.17     5.20    ...      65.8
Low Pressure (1-5)          322      20.94    6.87    6.6      3.07    ...      ...
High Pressure (6 or more)   67       23.28    6.67    5.71     2.51    ...      26.9

ᵃ Requirements for homogeneity of variance satisfied. F ratio between
highest and lowest variance = 1.23, not significant. F ratio (between
groups to within groups) = 6.88, p < .01.
ᵇ Using Bartlett's Test, requirements for homogeneity of variance not
met (p < .001). F ratio (between groups to within groups) = 7.28,
p < .001.
ᶜ Chi square = 1.80; df = 2, not significant.
ᵈ Chi square = 15.43; df = 2, p < .001.


Finally, the effects of the seven conditions


are examined in Table 3 which presents the
means and standard deviations of the hedonic
ratings and number of meat bars consumed and
the numbers and percentages “made sick” and
“willing to eat the ration in the future” for
each condition. Bartlett's Test indicates that
conditions of homogeneity of variance are
satisfied for the hedonic ratings but not for
number of bars consumed. Nevertheless,²
analyses of variance were made for both sets of
data and in both cases the F ratios are
significant at better than the .01 level. Thus,
it will be observed that on all four criteria
the effects due to experimental conditions are
statistically significant, since the chi
squares for both “made sick” and “eat in
future” are significant at better than the .02
and .01 levels, respectively. On all four
criteria, it will be observed that the
“no-influence” technique along with the control
conditions occupies a median position in terms
of effectiveness. The evaluation, information,
and group explanation conditions tend to be
accompanied by more favorable reactions, while
the good-example and individual-explanation
conditions tend to produce boomerang effects.
Thus, the two conditions occupying median
positions of effectiveness were perceived by
trainees as being low in pressure. Two of the
techniques producing best results occupy median
positions on the basis of perceived instructor
pressure and the two conditions producing
boomerang effects are rated highest in
instructor pressure.

² The Norton study and others (Lindquist, 1953)
indicate that failure to satisfy conditions of
homogeneity of variance is not as serious as it
was once regarded. It is generally suggested,
however, that a higher level of significance be
required than normally and this has been done
here.

When direct tests are applied, results obtained
under the “no-pressure” condition using hedonic
ratings as the criterion are significantly
better than the most effective condition
(CR = 2.687, p < .01) and poorer than the least
effective one (CR = . . ., p = .06). Using the
number of bars consumed as the criterion, the
difference between the “no-pressure” condition
and the most effective condition is significant
(CR = . . ., corrected for variance, p < .01)
but not between the “no-pressure” condition and
the least effective condition. Results based on
“made sick” are the same as for “bars consumed”
(chi square = 6.2105, p = .01) and those on “eat
in the future” are the same as for hedonic
ratings (chi squares = 4.3799 and . . .7931,
respectively, for “no pressure” versus most and
least effective).

Table 4 presents a comparison of the
acceptability of the meat bar to those
classified in the no-pressure, low-pressure,
and high-pressure categories on the basis of
number of kinds of instructor influence-acts.
It will be observed that on this basis there is
a consistent tendency for all of the indexes of
acceptance to vary inversely with degree of
pressure as defined here. The differences are
statistically significant except for percentage
“made sick.” It should be noted from Table 4
that the “no-pressure” group is most variable
on the number of bars consumed.




Discussion 


Apparently it is difficult for anyone cast in
a social role such as instructor to avoid
influencing or being perceived as influencing
role partners. Whatever he does, whether so
intended, tends to be interpreted as an effort
to influence. This is probably true not only in
the teacher-student relationship, but in such
relationships as counselor-counselee,
physician-patient, parent-child,
salesman-client, supervisor-worker, and the
like.

It appears that members of training groups
expect instructors to exercise influence toward
the group as a whole and influence acts so
directed tend not to be interpreted as “high
pressure.” Apparently such pressures tend to be
regarded as legitimate. When “personal
influence” is introduced and individuals are
singled out for persuasive efforts or attention
of any kind, these individuals tend to feel that
they are being “high pressured.” A number of
rationales for this phenomenon might be
advanced. In the face of influence efforts,
individuals may perceive the group as a
protection. A student singled out and
approached individually by an instructor, feels
stripped of his defenses (the group) and as a
result stiffens his resistance.

The data suggest that pressures up to a point,
particularly if they are perceived as
legitimate in terms of the influencer’s
official role, are effective. Beyond a certain
point or outside the limits of what is regarded
as legitimate, however, resistance is stiffened
and attempts to influence tend to boomerang.

The data concerning variability are of special
interest. Perception of “no pressure” seems to
be accompanied by especially erratic effects.
This is also true of the generally effective
technique of giving objective information in a
matter-of-fact manner. It is in this respect
that the giving-information and evaluation
conditions differ most in their effects. This
emphasizes the difficulty of generalizing too
broadly concerning influence techniques. The
method of influence that may be most effective
may depend upon the salesman and the prospect,
the counselor and the counselee, and the like.

Another difficulty in generalizing broadly
from this study is that there are elements in
the situation studied which are probably
different psychologically from conditions in
personal selling, counseling, and the like. It
is believed, however, that many of the
influence dynamics are essentially similar.


Summary 


In this study, an effort was made to test 
experimentally the relative effectiveness of 
varying degrees of pressure exerted by in- 
structors in indoctrinating aircrewmen con- 
cerning an emergency ration known as “pem- 
mican.” The Ss were 427 aircrewmen com- 
posing 43 small training groups randomly 
assigned to one control and six experimental 
groups. Subjects were issued eight of the 
meat bars for use during the nine-day simu- 
lated survival experience. Criteria of accept- 
ance were obtained at the end of training 
along with measures of perceived instructor 
effort to influence. It was found that instruc- 
tors were relatively unsuccessful in exercising 
“no influence” insofar as trainee perceptions 
are concerned. When the seven conditions 
were arranged in order of perceived instruc- 
tor pressure, it was found that pressure up to 
a certain point appears to be accompanied 
by increased acceptability and that beyond 
this point influence efforts operate in an in- 
verse direction to that intended. Those who 
perceive “no effort” to influence them, tend
to react most favorably.


Received July 1, 1958. 


References

Bursk, E. C. Thinking ahead: Drift to no-pressure
selling. Harvard Bus. Rev., 1956, 34, 25-324.
Lindquist, E. F. Design and analysis of experiments.
Boston: Houghton Mifflin, 1953.
Rogers, C. R. Counseling and psychotherapy.
Boston: Houghton Mifflin, 1942.
Torrance, E. P., & Mason, R. The indigenous leader
in changing attitudes and behavior. Int. J.
Sociometry, 1956, 1, 23-28.


Journal of Applied Psychology 
Vol. 43, No. 2, 1959 


PREFERENCES FOR LETTERS OF THE ALPHABET 


MICHAEL MECHERIKOFF 
Westmont College 
And DAVID L. HORTON 


University of Minnesota 


This study is an attempt to answer the 
questions: “Do consistent letter preferences 
exist?” and “If preferences exist, which let- 
ters may be considered approximately equal 
to each other in appeal?” Interest in these 
questions arose when a large food processing 
company found that the same product pack- 
aged in containers differing only in label 
yielded large and significant preferences. 

A direct attack on the question of whether 
or not letter preferences exist does not ap- 
pear to have been made, although many stud- 
ies have been done to investigate preferences 
in other areas. Closest to the present prob- 
lem are studies of number preferences, e.g., 
Yule (Chapanis, Garner, & Morgan, 1949; 
Yule, 1927) and a study by Forer (1940) 
which deal with preferences for sounds of 
consonants. Although studies of this kind 
lead one to suspect the existence of letter 
preferences of the type in which we are
interested, they are of no value in assessing
which letters are preferred in situations
similar to that of the food processor, or in
determining which letters may be considered
equal for purposes such as this.


Method 


equality of preference two preliminary studies were
done, and seven letters which showed the least evi- 
dence of preferences were then used in paired com- 
parisons. 

In the first study the Ss were told a story about a 
mythical community in which the inhabitants named 
their children after letters of the alphabet. The Ss 
were asked to “name a new baby” by indicating the 
five letters they thought most appropriate, the five 
next most appropriate, the five least appropriate, the 
five next least appropriate, and the six remaining. 
For each letter a graph was made showing its fre- 
quency in each of the five categories. 


In the second study, 80 students in an advanced
psychology course were asked to rank order the
alphabet. Half of them were asked to rank according
to the way the letter looked, the other according
to the way the letter sounded. The degree of
concordance among students on each of these tasks was
low. The rank difference correlation between the
mean ranks on the two lists was .50, which means
that there is only moderate agreement when different
criteria for judging are used.

The seven letters for paired comparison were
selected as follows:

1. If a letter appeared among the middle 10 letters
of both the “sound” and “looks” lists, it was
included. This gave T, P, N, and K, and all but
K appeared neutral from the first preliminary study
as well.

2. The letters S, V, and G were neutral on both
the first preliminary study and the “sound” list and
so were included.

The seven letters gave 21 pairwise combinations
which were mimeographed in capitals on sheets of
8½ × 5½ paper. S's task was to go down the list
rapidly and in each pair circle the letter he
preferred. This list of pairs was randomized, and some
changes were made in the random order so as to
avoid the appearance of the same letter in
successive pairs as much as possible, and so that each
letter appeared in right and left positions an equal
number of times. Eight forms of the basic list were
used to control three other variables.

1. Straight-Reversed. To control for preferential
treatment of the beginning or end of the list, the
order of the entire list was reversed for one-half of
the Ss.

2. AB-BA. To control for right-left preferences,
the position of the two letters was reversed half
the time.

3. Split-Nonsplit. To equalize the preferential
treatment of pairs appearing at the extremes of the
list versus the middle, the relative position of the
two halves was changed for half of the cases.

The eight forms were distributed to the Ss
randomly. The 182 (138 males, 44 females) Ss were
students taking an introductory psychology course
at the University of Minnesota.

Results

In order to determine whether letters differ
in preference, significance tests were
performed both for letters within pairs, and
for each letter when considered across all the
pairs in which it appeared. Table 1 and Table 2
present these data. In both cases the deviation
of the observed proportion from the
hypothesized value of .50 was tested for
significance, some of the pairs showing
preferences at the 5% level. When tested
irrespective of any particular pairing, one
letter (S) was significantly preferred at the
1% level, and one was significantly
nonpreferred at the 5% level. It should be
noted that the test for significance uses an N
of 1092 providing the measures are independent.
Since the 1092 measures were not independent
but appeared in related groups of six, N = 182
was used to test significance, which is
conservative. The results indicate that there
are stronger preferences for some letters than
for others, at least when the letters are
presented in pairs.


Table 1

Proportion of Individuals Choosing Letter on Left
When Paired with Letter on Top

       G       K       N       P       S       T
K     .40**
N     .50*    .58*
P     .58*    ...     .54
S     ...     ...     ...     ...
T     .48     ...     .32     .46     .34
V     .55     .52     .45     .52     .42*    .53

* Significant at the 5% level.
** Significant at the 1% level.


Table 2

Ratio of Number of Times Each Letter Was Selected to
the Possible Times it Could be Selected


Since some preferences are revealed in this
analysis and since the letters used here were
expected to yield the smallest differences, it
seems likely that the other letters of the
alphabet would yield differences as great or
greater (although this does not preclude
finding pairs of equally liked or equally
disliked letters).
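
The test of an observed proportion against the hypothesized value of .50 can be illustrated with a normal approximation. This is a sketch under assumptions: the article does not state which test statistic was used, the function name is hypothetical, and the letter-level proportion of .54 is an invented value chosen only to show why N = 182 is the conservative choice.

```python
from math import sqrt
from statistics import NormalDist


def proportion_z_test(p_obs, n, p0=0.50):
    """Two-sided normal-approximation test of a proportion against p0."""
    se = sqrt(p0 * (1 - p0) / n)
    z = (p_obs - p0) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


# Two pair proportions printed in Table 1, each based on N = 182 Ss.
z_58, p_58 = proportion_z_test(0.58, 182)
z_40, p_40 = proportion_z_test(0.40, 182)

# A hypothetical letter-level proportion of .54, tested with the
# conservative N = 182 and with the nominal N = 1092 (182 Ss x 6 pairs).
z_cons, p_cons = proportion_z_test(0.54, 182)
z_nominal, p_nominal = proportion_z_test(0.54, 1092)
```

Under this approximation the same deviation from .50 is much harder to declare significant with N = 182 than with N = 1092, which is the sense in which the smaller N is conservative.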

Analyzing for sex differences and positional 
preference, the pair NK showed a significant 
sex difference at the 1% level, and the pairs 
KP, VK, and KT were significant at the 5% 
level. The letter K was preferred by more 
women than men, and this female preference 
for K holds in the other two pairs involving 
K, although it is not statistically significant. 
Preferences of this variety would be of im- 
portance when only one sex is judging the 
product. No significant positional prefer- 
ences were discovered. 


Selecting Equal Pairs 


In problems where one is interested in de- 
tecting whether or not a difference really 
exists, the significance level is of central im- 
portance. But if one wishes to establish with 
a high probability that the difference is not 
larger than a certain amount (which is deter- 
mined by practical considerations), then the 
power of the test is of primary concern. The 
method of computing power is described by 
Walker and Lev (1953). 

For the present study it was decided that
if the proportion of Ss preferring one letter
over another fell between .45 and .55, these
letters would be considered equal in appeal.
With sample size and limits of allowable
deviation fixed, the only way to increase the
power is to change the significance level.
Figure 1 indicates the relationship between
power and significance level when N = 182
and the allowable deviation from p = .50 is
.05.

For this study it was decided that the
power of the test should be approximately
. . . The power was computed at the point
p = .45 and p = .55 (the power is the same
at both of these points because of symmetry).
If the true proportion is greater than .55 or
less than .45, the power will be greater than
it is at these points, and detection of the
falsity of the null hypothesis will be even
more probable.
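
The trade-off between power and significance level described above can be sketched numerically. The code is an illustration only: it uses a two-sided normal approximation, which is an assumption and not necessarily the Walker and Lev procedure the authors followed, and the function name is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def power_two_sided(p_true, n, alpha, p0=0.50):
    """Approximate power of a two-sided test of H0: p = p0 when p = p_true."""
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    se_null = sqrt(p0 * (1 - p0) / n)
    lower = p0 - z_crit * se_null      # reject if observed proportion < lower
    upper = p0 + z_crit * se_null      # reject if observed proportion > upper
    se_true = sqrt(p_true * (1 - p_true) / n)
    reject_low = std_normal.cdf((lower - p_true) / se_true)
    reject_high = 1 - std_normal.cdf((upper - p_true) / se_true)
    return reject_low + reject_high


N = 182
for alpha in (0.01, 0.05, 0.10, 0.20, 0.50):
    power = power_two_sided(0.55, N, alpha)
    print(f"alpha = {alpha:.2f}  power at p = .55: {power:.3f}")
```

With N and the allowable deviation fixed, the loop shows power rising as the significance level is relaxed, and calling the function with .45 in place of .55 gives the same value, as the symmetry remark in the text implies.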
The question of equality of letters in pairs
can now be considered specifically. When the




has answered the inventory items. Anyone having 
a high falsification or lie score will be in a less 
favorable position to get the occupation he wants, 
than those who mark the inventories truthfully.” 
Condition 3. Before reading the standard directions 
to the Ss they were informed that it was planned 
to use one or both of the inventories in a new train- 
ing program with which the Ss were familiar. It 
was explained that information was needed as to how 
much the two inventories could be faked. An incen- 
tive, consisting of assignment to training for the 
occupation of their choice, was offered for the five 
Ss who made the highest interest score on their pre- 
ferred occupation and who presented themselves in 
the most favorable light from the standpoint of 
personality scores. It was stipulated, however, that 
they must meet minimum aptitude requirements for 
the desired training in order to get the assignment. 


Each of the three basic conditions was further 
divided into two subconditions. In one subcondition, 
the interest inventory was administered first and the 
personality inventory second. Under the second sub- 
condition, the order of administration of the two 
inventories was reversed. In the treatment of the 
data by analysis of variance the effects due to order 
were removed.

Prior to giving the directions for taking the inven- 
tories each of the groups was asked to fill out an 
Occupation Preference Sheet on which the Ss indi- 
cated their choice of the 12 Naval Aviation occupa- 
tions. This sheet also contained an item which 
inquired into their degree of preference for the 
occupation selected. On the basis of this informa- 
tion the basic experimental groups were further 
atly preferred 
a certain occupation and those who indicated a less 
decided preference for the occupation of their choice. 

Of the 773 men who took the inventories, 581 indicated
that they would prefer one of the following occupations:
engine mechanic, structural mechanic, ordnanceman,
electronics technician, elec[...]. Interest inventory keys
developed for these occupations or for similar nonaviation
Navy occupations were used in scoring the interest
inventory. These 581 students constitute the sample used
in the study. Of this number, 426 indicated that they
“greatly preferred” the rating they indicated on the
Occupation Preference Sheet and 155 indicated a less
intense preference. For purposes of the study the 426 men
who “greatly preferred” one of the ratings were considered
a more highly motivated group occupationally than the 155
who marked the other alternatives, and the scores made by
the two groups were analyzed separately. The first group
will be referred to as the “greatly prefer” group and the
other group as the “non-greatly prefer” group.

The interest scores for the six occupational keys were
standard scores having a mean of 50 and a standard
deviation of 10. The standardization group was a group of
approximately 2000 Airman Preparatory School students who
had taken the interest inventory previously. The present
data were collected in the same Airman Preparatory School.
At the conclusion of this five weeks course students were
assigned to more specialized training which leads to one of
the Naval Aviation occupations. The Ss were members of
three classes, one of which was in the second week of
training, one in the third week, and one in the fourth
week. Each class contained 12 instructional sections. The
sections, which were filled as the men reported for
training, were assigned to the experimental conditions in a
manner designed to avoid any bias that might be associated
with class membership.


Results


If both of the hypotheses under consideration were correct
the lowest mean interest inventory score, and an elevated
F score on the MMPI, should occur under Condition 2, that
is, knowledge that a falsification score would be computed.
The second lowest score on occupational interest should
occur under Condition 1, standard directions, and the
highest score on occupational interest, along with an
elevated L score, should be observed under Condition 3, in
which the Ss were directed to fake the inventory as much as

Table 1

Comparison of Interest Scores on Preferred Occupation of
“Greatly Prefer” and “Non-Greatly Prefer” Groups Under Three Conditions

                                   “Greatly Prefer” Group    “Non-Greatly Prefer” Group
Condition                            N     Mean     SD          N     Mean     SD          F
Standard Directions                 146    64.54    8.50        52    61.00    7.55    [illegible]
Knowledge of Falsification Score    145    63.76    8.53        50    59.46    8.24    [illegible]
Directions to Fake                  135    64.49    9.30        53    61.28    9.29    [illegible]

 * .05 level of confidence.
** .01 level of confidence.


Faking in a Vocational Classification Situation 




Table 2

Comparisons of MMPI F and L Scores Made Under Different Conditions, Effects Due to
Order of Administration of the Two Inventories Removed

                                                   “Greatly Prefer” Group       “Non-Greatly Prefer” Group
Scale          Condition                            N    Mean    SD     F         N    Mean    SD     F
MMPI F Score   Standard Directions                 146   2.90   2.41   7.40       52   4.34   6.21    NS
               Knowledge of Falsification Score    145   3.72   2.68              50   3.98   2.90
MMPI L Score   Standard Directions                 146   4.64   2.33   6.64*      52   4.35   2.25   6.55*
               Directions to Fake                  135   5.42   2.88              53   5.45   2.12

 * .05 level of confidence.
** .01 level of confidence.


possible. The reverse would be true of scores on the
clinical scales of the MMPI, as maladjustment was
considered to be associated with high scores.

Generally, the results of comparisons involving interest
inventory scores and MMPI clinical scale scores supported
neither of the hypotheses. In the comparison of the three
experimental groups on the basis of interest scores for
the occupation preferred by each individual, no
significant differences were found for either the high
preference group or for the low preference group. Means
and standard deviations are presented in Table 1 (the F
test results shown in the table do not relate to this
comparison, but instead to another comparison that will be
described below). Neither did the three basic groups
differ significantly on any of the clinical scales of the
MMPI.2 But, as shown in the F test comparison in Table 2,
MMPI F scale scores were significantly higher within the
greatly prefer group under Condition 2, knowledge that a
falsification score would be computed, and L scale scores
were significantly higher under Condition 3, instructions
to fake, for both preference groups, when compared with
scores made under the condition of standard directions
(scores for the third, nonhypothesized condition in both
the F scale and L scale comparison did not differ
significantly from the condition of standard directions).
Unlike the general trend of the findings, these results
are in accord with the hypotheses.

2 A 2-page table giving data for each of the MMPI scales
has been deposited with the American Documentation
Institute. Order Document No. 5834, remitting $1.25 for
35-mm. microfilm or $1.25 for 6 by 8 in. photocopies.

An additional comparison aids in the interpretation of
the results and is also of interest in its own right. As
shown in the previously mentioned F test comparison in
Table 1, men who indicated on the Occupation Preference
Sheet that they greatly preferred the occupation of their
choice made higher mean scores on the interest inventory
than did men who did not greatly prefer the occupation of
their choice. This occurred despite the fact that both
groups were well above the mean for Navy men in general.
These results apparently reflect the validity of the
interest inventory and are in accord with more direct
evidence concerning the validity of the inventory
(Clark, 1949, 1956; Clark & Gee, 1954;
Mayo & Thomas, 1956).
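
The comparison between the two preference groups can be roughly checked from the summary statistics in Table 1. The following is a minimal sketch, not the authors' analysis: it assumes an ordinary pooled-variance comparison of two independent groups (for which F equals t squared) and ignores the adjustment for order of administration that the authors applied, so it should not be expected to reproduce the published F values. The function name `two_group_f` and the dictionary layout are illustrative only.

```python
def two_group_f(n1, mean1, sd1, n2, mean2, sd2):
    """F ratio (t squared) for two independent groups, from summary statistics."""
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / (n1 + n2 - 2)
    squared_se = pooled_var * (1 / n1 + 1 / n2)
    return (mean1 - mean2) ** 2 / squared_se


# (N, mean, SD) for the "greatly prefer" and "non-greatly prefer" groups,
# as read from Table 1 for each experimental condition.
table_1 = {
    "standard directions": ((146, 64.54, 8.50), (52, 61.00, 7.55)),
    "knowledge of falsification score": ((145, 63.76, 8.53), (50, 59.46, 8.24)),
    "directions to fake": ((135, 64.49, 9.30), (53, 61.28, 9.29)),
}

f_ratios = {
    condition: two_group_f(*greatly, *non_greatly)
    for condition, (greatly, non_greatly) in table_1.items()
}

for condition, f_value in f_ratios.items():
    print(f"{condition}: F = {f_value:.2f}")
```

Under these assumptions each unadjusted ratio exceeds 3.9, roughly the 5% point of F with 1 and about 190 df, which is consistent with the direction of the result reported in the text.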


Discussion 


Summarizing the results briefly: first, no sig- 
nificant differences were found between inter- 
est inventory scores or between MMPI clinical 
scale scores that could be attributed to the 
experimental conditions. Second, such differ- 
ences were found in the case of the MMPI 
F scale and L scale. Third, interest inventory 
scores were significantly higher for men who 
indicated that they greatly preferred a given 
occupation as compared with scores made by 
men who expressed a milder preference for the occupation
of their choice.

An interpretation of the data that was considered was
that all three experimental groups made the best score
they could on both inventories, that is, faked them as
much as possible and hence made essentially the same mean
scores on the interest inventory and the clinical scales
of the MMPI. But this interpretation encounters difficulty
in explaining why the L scores of the group that was
instructed to fake were higher than those of the groups
that were not instructed to fake. Neither is the finding
easily handled that the men who did not greatly prefer
their occupational choice had lower interest scores but
essentially the same MMPI L scores as the men who greatly
preferred the occupation of their choice. Furthermore,
mean L scores were moderately low, less than a raw score
of 5, in the groups that were not instructed to fake.

A more likely interpretation, and one which 
is believed to be compatible with all the 
findings, is that under the present design there 
was no marked, systematic tendency for the 
interest inventory and the clinical scales of 
the MMPI to be faked successfully under any 
of the conditions, although the evidence points 
to an attempt to fake them under Condition 
3. It is suggested that the generality of 
this interpretation be considered as tentative 
and subject to change in the light of further 
evidence collected in vocational classification 
situations. It should be made explicit that 
no claim is made that the two inventories 
were not faked to some extent. Further, at- 
tention is again drawn to a special condition 
in the design of the study, namely, that indi- 
viduals with an already validly high degree 
of interest in an occupation were being tested. 
This makes the demonstration of faking as 
indicated by even higher interest scores fairly 
difficult, but is, in the writers’ opinion, closer 
to a real-life situation than are the classical 
experimental designs concerning faking. It 
would appear that the design did provide an 
adequate opportunity for any large scale fak- 
ing to be detected, and the results failed to 
substantiate that this occurred. The results 
concerning the effect of providing information 
that a falsification score will be computed are 
interpreted to indicate that the procedure 
either is not needed with these two inventories or
that it is not effective under the conditions of this
study.


George D. Mayo and Isaiah Guttman 


Summary 


Two hypotheses concerning faking on an 
interest inventory and a personality inventory 
were tested under the real-life motivation con- 
ditions of a military occupational classifica- 
tion situation. Generally, the results failed 
to support either the first hypothesis—that 
knowledge that a falsification score would be 
computed would result in less favorable mean 
inventory scores, thereby indicating less fak- 
ing—or the second hypothesis—that direc- 
tions to fake the inventories, accompanied by 
an appropriate incentive, would result in more 
favorable mean inventory scores, indicating 
more faking. 

Significantly higher MMPI “lie” L scores were made under
directions to fake the inventories, and significantly
higher MMPI “validity” F scores were found in one of the
two groups under the condition involving knowledge that a
falsification score would be computed. The men who said
they greatly preferred a certain aviation occupation made
higher mean interest inventory scores on the occupation
they preferred than did men who expressed a less intense
interest in their preferred occupation. The data do not
force the conclusion that faking on the two self-report
inventories in a military occupational classification
situation is minimal, but the experiment provided a
favorable opportunity for faking to manifest itself and it
was not observed to any appreciable extent.


Received July 17, 1958. 


References 


Clark, K. E. A vocational interest test at the skilled
trades level. J. appl. Psychol., 1949, 33, [pages illegible].

Clark, K. E. Manual for use of the Navy Vocational
Interest Inventory, navy airman version. Univer.
of Minnesota, 1956.

Clark, K. E. & Gee, Helen H. Selecting items for
interest inventory keys. J. appl. Psychol., 1954,
38, 12-18.

Cross, O. H. A study of faking on the Kuder
Preference Record. Educ. psychol. Measmt., 1950,
10, 271-277.

Gordon, L. V. & Stapleton, E. S. Fakability of a
forced-choice personality test under realistic high
school employment conditions. J. appl. Psychol.,
1956, 40, 258-265.

Gough, H. G. Simulated patterns on the Minnesota
Multiphasic Personality Inventory. J. abnorm.
soc. Psychol., 1947, 42, 215-225.

Heron, A. The effects of real-life motivation on
questionnaire response. J. appl. Psychol., 1956,
40, 65-68.

Kelly, E. L., Miles, C. C., & Terman, L. M. Ability
to influence one’s score on a pencil and paper test
of personality. Charact. & Pers., 1936, 4, 206-[...].

Kimber, J. A. M. Insight of college students into
the items on a personality test. Educ. psychol.
Measmt., 1947, 7, 411-420.

Longstaff, H. P. Fakability of the Strong Interest
Blank and the Kuder Preference Record. J. appl.
Psychol., 1948, 32, 360-369.

Mayo, G. D. & Thomas, D. S. Agreement between
counselor-counselee vocational decisions and interest
inventory scores. Personn. Guid. J., 1956, 35,
37-38.

Noll, V. H. Simulation by college students of a
prescribed pattern on a personality scale. Educ.
psychol. Measmt., 1951, 11, 478-488.

Rabinowitz, W. The fakability of the Minnesota
Teacher Attitude Inventory. Educ. psychol.
Measmt., 1954, 14, 657-664.

Steinmetz, H. L. Measuring ability to fake occupational
interest. J. appl. Psychol., 1932, 16, 123-130.


Journal of Applied Psychology 
Vol. 43, No. 2, 1959 


USE OF TELEVISION FOR REMOTE CONTROL:
A PRELIMINARY STUDY 1


ROBERT L. MARTINDALE AND WILLIAM F. LOWE


Air Force Special Weapons Center 


Future air weapon and space flight systems 
are expected to require a wide range of re- 
mote control and manipulation activities. In 
many of these cases direct sensory contact 
with the remote field of activity will not be 
possible. Closed circuit television has been 
suggested as a simple means to provide visual 
feedback under such conditions. Unfortu- 
nately, it has not been commonly realized 
that the use of television may systematically 
alter the visual field. Such alterations can 
produce movements in the visual field that 
conflict with the movements an operator 
might expect from his motor performance. 
An earlier study demonstrated the disruption 
of performance that may result when the 
visual field of direct motor behavior is sys- 
tematically altered through the use of tele- 
vision (Smith, Smith, Stanley, & Harley, 
1956). 

The present study is a preliminary experi- 
mental test of the use of television in a re- 
mote performance situation. This was a sim- 
ple cyclic task performed by means of a sim- 
ple extension of a motor end organ. Two 
general questions were posed. Does accuracy 
of performance vary when the visual field 
is systematically varied? Secondly, does the 
accuracy of performance improve when the 
visual orientation is normalized but the pro- 
prioceptive cues remain systematically al- 
tered? It is reasonable to assume that any 
deterioration in the accuracy of performance 
demonstrated with the simple cyclic task of 
the present experiment might be even more 

1 This experiment was carried out in the Human 
Factors Division Laboratories, Research Directorate, 
Air Force Special Weapons Center, Kirtland Air 
Force Base, Albuquerque, New Mexico under Air 
Force Project 1811. Edward S. Halas assisted in 
running the Ss. 

Permission is granted for reproduction, translation, 
publication, use and disposal in whole and in part 
by or for the United States Government. The opin- 
ions expressed in this paper are those of the authors 
and do not necessarily reflect the views or have the 
endorsement of the U. S. Air Force or U. S. Govern- 
ment. 


pronounced with the irregular tasks that
would be more typical of a practical remote
control situation.


Method 


Fifteen male right-handed Air Force officers were
utilized as Ss. The task required S to follow a pursuit
rotor target with a stylus while viewing the rotor
turntable and stylus tip in a 17 in. black and white
television monitor screen. Each S was seated in a chair
before a 6 in. turntable which revolved in a
counterclockwise direction at 1 rpm. The surface of the
turntable was positioned in a horizontal plane 23 in. from
the floor. The target was [...] in. in diameter and
revolved at a constant radius of 1 in. around the turntable
center. The stylus [...] in the vertical plane, but was
rigid in the horizontal. Both turntable and stylus tip
were obscured from S’s direct vision.

A television camera pointed down toward the turntable at a
45° angle which approximated what would have been S’s
normal line of sight to the unobscured turntable. The
visual field was displaced in the horizontal plane by
positioning the camera as illustrated in Fig. 1. The
angular displacements produced of the visual field around
the turntable from S’s normal unobscured line of sight were
175° (A) and 90° (B) to S’s right and 90° (D) and 175° (C)


Fig. 1. Top view diagram of S, turntable, camera,
and monitor positions. (Labels in the diagram: monitor
position for conditions (A), (B), (C), (D); camera
positions (A), (B), (C), and (D), (E); turntable; subject's
normal unobscured line of sight.)


Table 1

Analysis of Variance of the Time on Target Scores

Source of Variance                 df    Sum of Squares    Mean Square        F
Total                              74       111,239
  Order of conditions               4        20,002         5,0[...]     [illegible]
  Residual between individuals     10         9,951            995
Total between individuals          14        29,953
  Experimental conditions           4        60,408         15,102        38.82**
  Trials                            4           668            167
  Residual from latin square       12         7,166            597         1.83
  Residual within latin square     40        13,044            326
  Pooled error                     52        20,210            389
Total within individuals           60        81,286

 * Significant at 5% level of confidence.
** Significant at 1% level of confidence.


to S’s left. The monitor screen was positioned with its
center 175° [...] for the first four experimental
conditions. The fifth condition (E) was produced by
positioning the camera as in condition (D) with the monitor
positioned 90° around the [...] of S’s normal line of
sight. The monitor presented an image of identical size
[...] larger than would have been expected with direct
vision of the turntable. The center of the turntable image
on the monitor was 2[...] in. from the floor and
approximately 80 in. from S’s eyes under both monitor
conditions. A 3-min. trial was given to each of the 15 Ss
for each of the five experimental conditions. The
intertrial interval was approximately 4 min. Time on target
during each trial was electrically recorded. The time on
target data were analyzed in a 5 X 5 latin square with 3
replications. Trials constituted columns, [...] orders,
and cells were [...].

S was instructed as to his required performance [...] the
experimental session and E [...] a brief demonstration.
Before each trial, S [...] on the stationary target
through [...] monitor. Otherwise, no practice [...]. The
turntable and timer were started [...] S was on target and
ready. In Condition (E), S was instructed to maintain [...]
orientation to the task but to turn [...] right to view the
monitor screen.


Results 


Table 1 presents an analysis of variance of the time on
target scores. These data meet the requirements for
homogeneity of variance and for [...] between latin square
replications. Therefore, significant F values are
attributed to differences between means and
the combined analysis is considered justified.
Differences between experimental conditions
and between orders of presentation of the
conditions exhibited the only significant F
values. The trials (columns in the latin
square) did not exhibit a significant F value.
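
The arithmetic behind Table 1 can be retraced from the printed sums of squares and degrees of freedom. The sketch below is illustrative rather than the authors' computation; it assumes, following the text, that conditions and trials are tested against the pooled error and that orders are tested against the residual between individuals. The dictionary keys and helper name are hypothetical labels.

```python
# (sum of squares, df) as printed in Table 1 for the time-on-target scores.
rows = {
    "order of conditions": (20002, 4),
    "residual between individuals": (9951, 10),
    "conditions": (60408, 4),
    "trials": (668, 4),
    "residual from latin square": (7166, 12),
    "residual within latin square": (13044, 40),
}


def mean_square(name):
    """Mean square = sum of squares divided by its degrees of freedom."""
    ss, df = rows[name]
    return ss / df


# The pooled error combines the two within-individual residual terms.
pooled_ss = rows["residual from latin square"][0] + rows["residual within latin square"][0]
pooled_df = rows["residual from latin square"][1] + rows["residual within latin square"][1]
pooled_ms = pooled_ss / pooled_df

f_conditions = mean_square("conditions") / pooled_ms
f_trials = mean_square("trials") / pooled_ms
f_residual = mean_square("residual from latin square") / mean_square("residual within latin square")
# Assumption: orders are tested against the residual between individuals.
f_order = mean_square("order of conditions") / mean_square("residual between individuals")

print(f"pooled error: SS={pooled_ss}, df={pooled_df}, MS={pooled_ms:.0f}")
print(f"F conditions={f_conditions:.2f}, trials={f_trials:.2f}, "
      f"residual={f_residual:.2f}, order={f_order:.2f}")
```

The pooled error recovers the printed 20,210 on 52 df, and the ratio for conditions agrees with the printed 38.82 to within rounding of the mean squares.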

Table 2 lists the means for each experimental
condition. A mean difference of 19.3
is required for significance at the 1% and
14.5 at the 5% level of confidence based on
the multiple t test or “least significant difference.”
The estimate of error variance is the
“pooled error” mean square of 389, each mean
is based on 15 observations, and t is the
applicable two-tailed point from Student’s t tables with
52 df. Therefore, the mean of Condition A
differed from B, D, and E; C from D, B, and
E; and E from D at less than the 1% level.
Condition E differed from B at the 5% level.
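
The least significant differences quoted above follow from the usual formula LSD = t * sqrt(2 * MS_error / n). The sketch below applies it to the condition means of Table 2. The critical values of t for 52 df (2.007 and 2.674, two-tailed) are assumptions taken from standard tables, and the names `lsd` and `classify` are illustrative.

```python
import math
from itertools import combinations


def lsd(error_ms, n_per_mean, t_crit):
    """Least significant difference between two means based on n scores each."""
    return t_crit * math.sqrt(2 * error_ms / n_per_mean)


# Assumed two-tailed critical values of Student's t for 52 df.
T_05, T_01 = 2.007, 2.674

# Mean time on target (sec.) per trial, from Table 2.
means = {"A": 117.1, "B": 53.9, "C": 102.7, "D": 42.9, "E": 69.4}

lsd_05 = lsd(389, 15, T_05)
lsd_01 = lsd(389, 15, T_01)


def classify(difference):
    """Label a mean difference by the strictest level it reaches."""
    if difference >= lsd_01:
        return "1%"
    if difference >= lsd_05:
        return "5%"
    return "ns"


results = {
    (a, b): classify(abs(means[a] - means[b]))
    for a, b in combinations(means, 2)
}

print(f"LSD 5% = {lsd_05:.1f}, LSD 1% = {lsd_01:.1f}")
for pair, level in results.items():
    print(pair, level)
```

With these assumed t values the sketch gives 14.5 and 19.3, and the only pairs that fall short of the 5% criterion are A with C and B with D.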

Table 3 lists the orders of presentation of


Table 2

Mean Time on Target per Trial for
Experimental Conditions

Experimental        Mean Time
Condition             (sec.)
A                     117.1
B                      53.9
C                     102.7
D                      42.9
E                      69.4


Table 3

Mean Time on Target per Experimental Session
for Orders of Presentation of Experimental
Conditions

                                   Mean Time
Order of Conditions                  (sec.)
I    (A) (B) (C) (D) (E)             316.7
II   (B) (C) (D) (E) (A)             459.0
III  (C) (D) (E) (A) (B)             270.0
IV   (D) (E) (A) (B) (C)             400.3
V    (E) (A) (B) (C) (D)             483.7


conditions and the respective mean for each
order. A difference of 81.6 was required at
the 1% and 57.4 at the 5% level of confidence
based on the multiple t test. The estimate
of error variance is the “residual between
individuals” mean square of 995, each
mean is based on 3 observations, and t is the
applicable two-tailed point from Student’s t
tables with 10 df. Therefore, the mean of
Order III differed from II, IV, and V; I from
II, IV, and V; and V from IV at less than the
1% level. Order IV differed from II at less
than the 5% level.
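
The same least-significant-difference arithmetic can be repeated for the order means of Table 3. This is again a sketch under assumptions: the critical values of t for 10 df (2.228 and 3.169, two-tailed) come from standard tables, and the variable names are illustrative.

```python
import math
from itertools import combinations

# Assumed two-tailed critical values of Student's t for 10 df.
T_05, T_01 = 2.228, 3.169

ERROR_MS = 995   # "residual between individuals" mean square
N_PER_MEAN = 3   # each order mean is based on 3 Ss

# Mean time on target (sec.) per experimental session, from Table 3.
order_means = {"I": 316.7, "II": 459.0, "III": 270.0, "IV": 400.3, "V": 483.7}

standard_error = math.sqrt(2 * ERROR_MS / N_PER_MEAN)
lsd_05 = T_05 * standard_error
lsd_01 = T_01 * standard_error

levels = {}
for a, b in combinations(order_means, 2):
    difference = abs(order_means[a] - order_means[b])
    if difference >= lsd_01:
        levels[(a, b)] = "1%"
    elif difference >= lsd_05:
        levels[(a, b)] = "5%"
    else:
        levels[(a, b)] = "ns"

print(f"LSD 5% = {lsd_05:.1f}, LSD 1% = {lsd_01:.1f}")
for pair, level in levels.items():
    print(pair, level)
```

Under these assumptions Orders IV and V differ at the 1% level, II and IV at the 5% level, and the pairs I with III and II with V do not reach the 5% criterion.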


Summary and Conclusions 


Different systematically displaced televised
performance fields produced marked differences
in performance accuracy on a simple
cyclic motor task. Average time on target
ranged from about 2/3 of the trial for the best
condition to less than 1/4 for the poorest
condition. The best performance occurred with
the visual field displaced 175°, i.e., with an
approximately inverted image, and the worst
performances with 90° displacements. A
marked improvement was noted when the
90° displaced field was normalized by
repositioning the monitor screen, but the
proprioceptive cues remained 90° out of phase in
Condition (E). The significant differences
between orders of experimental conditions
undoubtedly indicate a practice effect. However,
the present design does not lend itself
to an analysis of these differences which may
be due in part to an artifact introduced by
the systematic diagonal latin square that was
employed.

The usefulness of closed circuit television
as a means to provide visual feedback for a
remote performance field appears to be seriously
limited when the visual field is displaced.
This limitation can be partially overcome
by repositioning the monitor screen in
the operator’s visual field in such a manner
as to compensate for the camera displacement.


Received July 16, 1958. 


Reference 


Smith, W. M., Smith, K. U., Stanley, R., & Harley,
W. Analysis of performance in televised visual
fields: Preliminary report. Percept. mot. Skills,
1956, 6, 195-198.


Journal of Applied Psychology
Vol. 43, No. 2, 1959


OPTIMAL INTERVAL LENGTH FOR VISUAL INTERPOLATION:
THE EFFECT OF VIEWING DISTANCE 1


A. V. CHURCHILL

Defence Research Medical Laboratories, Toronto, Canada


In situations involving visual displays it is
often assumed that the “law” of the visual
angle is applicable, i.e., that an increase or
decrease in viewing distance must be accompanied
by a proportional increase or decrease
in the dimensions of the display and thus
maintain a constant visual angle. This assumption
has led to the recommendation that
in research on visual displays the standard to
be employed when specifying the stimulus
variable of size is “visual angle in degrees
or actual dimensions—(provided distance is
also given)” (U. S. Armed Forces, 1950).

A recent study of visual interpolation 2
(Churchill, 1956) disclosed a trend towards
an optimal interval length when interpolating
to tenths of a scale interval from a viewing
distance of 28 inches. Subsequent trials
at viewing distances of 56 inches and 84
inches, with the same displays, yielded results
which suggested that the optimal length
of interval was independent of viewing distance,
tending to be constant over the distances
tested.

The experiments reported here were undertaken
to determine (a) the optimal length of
interval for interpolating in tenths, and
(b) the effect of viewing distance on the optimal
interval length.


Experiment 1


Method


Apparatus. The apparatus has been described elsewhere
(Churchill, 1956). Seven horizontal scale intervals
(0.25, 0.5, 0.75, 1.0, 1.5, 2.0, and 3.0 inches
long) were used. A “0” appeared above the scale
mark at the left extremity of the interval, and a
“10” above that to the right. The separation between
the pointer tip and the horizontal scale line
was 0.125 in. in the plane of the scale. Viewing distances
were 28, 56, and 84 in. Display brightness
was 120 footlamberts for all conditions.

1 Defence Research Medical Laboratories Report
No. 164-8, PCC No. D77-94-20-27, 55.

2 By visual interpolation is meant the estimation
of the position of the pointer at unit positions, 1-9,
between the two boundaries or extreme values of the
scale interval.

3 Errors of interpolation are defined as those
estimations which are incorrect.

Procedure. Twenty-four laboratory personnel 
served as Ss. The six orders of presentation of the 
three viewing distances were each used four times 
and assigned to the Ss randomly. Scale intervals 
were presented in random order at each viewing dis- 
tance. Eighteen settings, two at each pointer posi- 
tion from 1 to 9 in random order, constituted a trial 
for each interval. Exposure time was 0.5 sec., with 
an interexposure period of 4 sec. for S’s response. 
The procedure was repeated, in a different random 
order, before changing the viewing distance. Since 
interpolating from the shorter scale intervals at the 
longer viewing distances presented a difficult visual 
problem, the 0.25-in. interval was not presented at 
56 in. nor were the 0.25-in. and 0.5-in. intervals 
presented at 84 in. 

Ss were instructed in the task and shown sample 
scales before beginning the trials. Interpolations 
were reported to the nearest unit. A “Ready” signal
preceded each trial.


Results 


The data are presented graphically in Fig. 1.
Part A of Fig. 1 shows the effect of interval
length on interpolation accuracy, with an
optimal interval of 1.0 in. at the 28-in. viewing
distance. The effect of viewing distance
is shown by an optimal interval of 1.5 in. at
the 56-in. viewing distance and 1.0-1.5 in. at
the 84-in. viewing distance. Part B of Fig. 1
presents the same data in terms of visual
angle.

Fig. 1. Scale interpolation errors as a function of
(A) interval length, and (B) visual angle, at three
viewing distances. (Ordinate: percentage of
interpolations in error. Abscissas: scale interval length
in inches; visual angle in degrees.)

It is apparent from these data that a scale 
interval length of 1.0-1.5 in. generates a 
minimum number of errors of interpolation, 
regardless of viewing distance. The fact that 
the three curves in Part B of Fig. 1 do not 
constitute a single curve is interpreted to 
mean that the “law” of the visual angle is not 
applicable to the displays and conditions un- 
der consideration here. 


Experiment 2 


In Experiment 1, the dimensions of the 
component parts of the intervals—line thick- 
ness, pointer dimensions and numeral size— 
were constant, interval length being the only 
factor varied. Consequently, the component 
parts (pointer, digits, etc.) subtended smaller 
visual angles at greater viewing distances. In 
Experiment 2 the dimensions of these parts 
were kept proportional to variations in in- 
terval length, i.e., the dimensions subtended 
the same visual angles at the different viewing
distances.


Method 


Apparatus. A horizontal scale interval, with “0” 
at the left and “10” at the right, was photographed 
with the pointer in turn at each of the 11 positions 
from “0” to “10” and sets of black on white slides 
were made from the photographs. Approximate di- 
mensions of the component parts of a projected 1-in. 
interval were: horizontal scale line, 1.0 X .03 in. 
wide; vertical scale marks at the extremities, .20 X
.03 in. wide; numerals “0” and “10,” .10 in. high;
pointer, .56 X .08 with a tip of .03 in. wide; separation
between pointer tip and horizontal scale line,
.08 in. in the plane of the scale. The projection apparatus
was such as to permit variations in interval
length to be accompanied by proportional variations 
in the dimensions of all component parts. Display 
brightness was 10 foot-lamberts for all conditions, 

Procedure. Nine combinations of interval length
and viewing distance (0.5-in. and 1.5-in. intervals at
28 in.; 0.5, 1.0, 1.5, and 3.0 in. at 56 in.; 0.5, 1.5,
and 4.5 in. at 84 in.) were presented in random order
to each of the five laboratory personnel who
served as Ss. Following a five-minute rest the presentations
were repeated in a different random order.
The intervals were selected to permit a comparison
of two interval lengths (0.5 and 1.5 in.), and two
visual angles (approximately 1 and 3 degrees) at the
three viewing distances.
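
The choice of intervals can be checked by computing the visual angle each one subtends. The sketch below assumes the interval is viewed head-on and centered on the line of sight, so that the angle is 2 * atan(length / (2 * distance)); the function name `visual_angle_deg` is illustrative.

```python
import math


def visual_angle_deg(length_in, distance_in):
    """Visual angle (degrees) subtended by a length viewed head-on at a distance."""
    return math.degrees(2 * math.atan(length_in / (2 * distance_in)))


# The nine combinations of interval length and viewing distance (inches).
combinations_used = {
    28: [0.5, 1.5],
    56: [0.5, 1.0, 1.5, 3.0],
    84: [0.5, 1.5, 4.5],
}

angles = {
    (distance, length): visual_angle_deg(length, distance)
    for distance, lengths in combinations_used.items()
    for length in lengths
}

for (distance, length), angle in angles.items():
    print(f"{length:>4} in. at {distance} in.: {angle:.2f} degrees")
```

On this assumption the 0.5-in. interval at 28 in., the 1.0-in. interval at 56 in., and the 1.5-in. interval at 84 in. all subtend about 1 degree, while 1.5 in. at 28 in., 3.0 in. at 56 in., and 4.5 in. at 84 in. all subtend about 3 degrees.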

A trial consisted of 20 exposures. The first two at 
pointer positions “0” and “10” respectively oriented 
S while the remaining 18 were a randomized series 
of pointer positions 1 to 9, twice each. Exposure 
time was .25 sec., with an interexposure period of 
4 sec. for S’s response. The procedure was repeated 
one week later. 


Results 


Data were tabulated in percentage of inter- 
polations in error, and transformed to degrees 
(θ = sin⁻¹ √p) to satisfy the assumptions of
analysis of variance (Quenouille, 1950). Re- 
sults of the analysis of the nine display con- 
ditions, for the two days, are presented in 
Table 1. 
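
The angular transformation can be sketched as follows. The theoretical variance of a transformed proportion based on n observations is approximately (180/π)² / (4n), or about 821/n when angles are expressed in degrees. The assumption made here, not stated explicitly in the text, is that each percentage rests on the 18 settings of a single trial; the function names are illustrative.

```python
import math


def arcsine_degrees(percent_in_error):
    """Angular transformation: theta = arcsin(sqrt(p)), in degrees."""
    proportion = percent_in_error / 100.0
    return math.degrees(math.asin(math.sqrt(proportion)))


def theoretical_variance(n_observations):
    """Approximate variance of theta (degrees squared) for a proportion based on n observations."""
    return (180 / math.pi) ** 2 / (4 * n_observations)


def back_transform(theta_degrees):
    """Convert a mean angle back to a percentage."""
    return 100.0 * math.sin(math.radians(theta_degrees)) ** 2


print(f"50% in error -> {arcsine_degrees(50):.1f} degrees")
print(f"theoretical variance for n = 18: {theoretical_variance(18):.1f}")
print(f"45 degrees -> {back_transform(45):.1f}%")
```

With n = 18 the theoretical variance comes to about 45.6, which agrees with the theoretical residual printed beneath Table 1.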

Table 1 shows an error term of the same
order of magnitude as the theoretical residual,
indicating that the performance of Ss is consistent
from trial to trial within the same day.
The significant S X D interaction shows that
the over-all level of performance changes from
day to day in a manner dependent on individual
Ss. The absence of inflated interactions
involving conditions indicates the reproducibility
of performance under these conditions
across Ss and Days.

The means of percentages of interpolations 
in error (retransformed from degrees), for 
the nine conditions, are shown in Table 2. 

Table 2 shows that with the smaller visual 
angle (1 degree), errors decrease as viewing 
distance is increased. With the larger visual 


Table 1

Analysis of Variance of Nine Combinations of
Interval Length and Viewing Distance

Source              df     Mean Square
S (Subjects)         4        839.5*
C (Conditions)       8       1782.4*
D (Days)             1          9.7
C X D                8         38.4
S X C               32         52.8
S X D                4        287.2*
S X C X D           32         53.0
Error               90         37.3
Total              179

* p < .01. Theoretical Residual 45.6.




Table 2

Means of Percentages of Interpolations in Error
(Based on 18 Observations X 2 Trials X 2 Days
X 5 Subjects)

Viewing Distance        Scale Interval Length (inches)
(inches)               0.5      1.0      1.5      3.0      4.5
28                 [illegible]   -   [illegible]   -        -
56                    35.7     14.8      8.5     12.0       -
84                    46.2      -        9.4      -       23.7


angle (3 degrees), errors decrease as viewing
distance is decreased. It is to be noted that
in both comparisons the 1.5-in. interval is
optimum, although it subtends the smaller
angle (1 degree at 84 in.) in one instance,
and the larger angle (3 degrees at 28 in.) in
the other. It is also apparent that the 1.5-in.
interval is optimum at all three viewing
distances.

The data from Table 2 are presented
graphically in Fig. 2.

Parts A and B of Fig. 2 show the plots for
comparable interval lengths and visual angles,
respectively, at the three viewing distances.
The difference in the slope of the
lines representing interval length in Fig. 2
(A) reflects the fact that interpolation to
tenths of the 0.5-in. interval becomes more
difficult as viewing distance is increased. The
difference in the slope of the lines representing
visual angle in Fig. 2 (B) indicates the
absence of angular constancy.

Fig. 2. Scale interpolation errors as a function of
(A) interval length, and (B) visual angle, at three
viewing distances. (Ordinate: percentage of
interpolations in error. Abscissas: scale interval length
in inches; visual angle in degrees.)


Discussion 


The recommendation (U. S. Armed Forces, 
1950), based on the assumption that the 
“law” of the visual angle applies to situations
involving visual displays, implies that
the specification of display size in terms of 
visual angle is synonymous with the specifi- 
cation of display size in actual dimensions 
and viewing distance. 

The results of the present study suggest 
that these two modes of specifying the stimu- 
lus variable of size are not synonymous. 
“Actual dimensions” appears to be a more 
crucial factor than “visual angle.” 

In the classical size-constancy experiment
the observer is required to “size-match”
standard and comparison stimuli which are
presented simultaneously at different distances
from the observer. The task involved
in the present study, visual interpolation,
does not require “size-matching” and is thus
an unusual approach to the size-constancy
problem.

It is evident from the results reported here
that the kind of constancy demonstrated by
the classical size-constancy experiment is not
confined to situations involving the “size-matching”
procedure. The fact that viewing
distance has no effect on the optimal interval
length signifies the existence of a constancy
effect where size, per se, is not the dimension
being judged.


Summary and Conclusions 


Two experiments were conducted to establish the optimal length of interval for visual interpolation in tenths and to determine the effect of viewing distance on the optimal interval length. Results of Experiment 1 show that an interval length of 1.0-1.5 in. generates a minimum number of errors of interpolation at the three viewing distances (Fig. 1). Results of Experiment 2 show an optimal interval length of 1.5 in. which is not affected by changing the viewing distance from 28 to 56 or 84 inches (Fig. 2).




From these results it is concluded that the 
“law” of the visual angle does not apply un- 
der the conditions tested. It is suggested that 
display dimensions and viewing distance be 
stated when specifying display size, rather 
than combining these dimensions and speci- 
fying display size in terms of visual angle. 
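
The relation between the two ways of specifying display size can be made concrete with a short calculation. The sketch below is an illustration only: it uses the standard geometric formula for the angle subtended by an object (the formula is not stated in the article), and it assumes that the angles quoted in the Results refer to the 1.5-in. interval. It shows that a fixed interval length subtends a very different visual angle at each of the three viewing distances.

```python
import math


def visual_angle_deg(size_in: float, distance_in: float) -> float:
    """Angle (in degrees) subtended by an object of size_in inches
    viewed head-on from distance_in inches."""
    return math.degrees(2.0 * math.atan(size_in / (2.0 * distance_in)))


INTERVAL_IN = 1.5                   # optimal interval length reported in the study
VIEWING_DISTANCES_IN = (28, 56, 84)  # viewing distances used in Experiment 2

# Visual angle of the same physical interval at each viewing distance.
angles = {d: visual_angle_deg(INTERVAL_IN, d) for d in VIEWING_DISTANCES_IN}

for distance, angle in angles.items():
    print(f"{INTERVAL_IN} in. at {distance} in. subtends {angle:.2f} degrees")
```

Under these assumptions the same 1.5-in. interval spans roughly 3 degrees at 28 in. and roughly 1 degree at 84 in., which is why stating dimensions and distance carries more information than stating the angle alone.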


Received July 18, 1958. 


References 


Churchill, A. V. The effect of scale interval length and pointer clearance on speed and accuracy of interpolation. J. appl. Psychol., 1956, 40, 358-361.

Quenouille, M. H. Introductory statistics. London: Butterworth-Springer, 1950.

U. S. Armed Forces, NRC Vision Committee. Standards to be employed in research on visual displays. Washington, D. C., 1950.




Journal of Applied Psychology
Vol. 43, 1959


COMPARABILITY OF WONDERLIC TEST FORMS IN 
INDUSTRIAL TESTING 


LEONARD J. KAZMIER ¹
Ohio State University

AND C. G. BROWNE
Wayne State University


The Wonderlic Personnel Test, a short 
group test of mental ability designed espe-
cially for industrial testing use, is available
in five alternate forms, A, B, D, E and F,
each of which includes 50 test items and in-
volves 12 minutes of testing time. Forms
D, E and F were developed by utilizing test 
items from the Otis Self-Administering Test 
of Mental Ability—Higher Form, while forms 
A and B were developed independently at a
later time by Wonderlic.

Wonderlic (1945) refers to the five forms of the Personnel Test as being equal and similar. Accordingly, the published norms are not differentiated by form, but are considered applicable to any form. However, if it were found that any of the forms are significantly easier or more difficult than the others, using the same norms would affect the selection process. Hay (1952) reported a study in which 400 young women applicants for clerical positions in a large organization were given both Forms D and F of the Personnel Test. He found that Form F was significantly easier than Form D at the 1% level of confidence. Weaver and Boneau (1956) reported a study concerning the comparability of all five forms of the Personnel Test in an academic testing situation involving 70 Ss. Of the 10 differences between the means of pairs of test forms, nine significant differences were reported: two at the 2% level, three at the 1% level, and four at the .1% level. Only Forms D and E did not differ significantly from each other.

The present study was initiated in order 
to investigate the comparability of all five 
forms of the Personnel Test in an industrial 
testing situation, the kind of situation for 
which the test was especially designed.
—— 

1 Formerly at Wayne State University. 


Sample and Procedure 


The Ss were 590 male applicants for an industrial 
apprenticeship program involving such trades as tool 
making, die making and plumbing pipe fitting in a 
large manufacturing company. The formal educa- 
tion of the applicants ranged from completion of the 
eighth grade to completion of college, with a mean 
of 11.77 years. Their ages ranged from 17 to 38 
years, with a mean of 21.81 years. 

The Ss were tested in 16 sessions, so that about 37 Ss were tested in each session. The seating of the Ss in the examination room was by their own choice. All five forms of the Personnel Test were administered in all sessions in such a way that every fifth man took Form A, every fifth man took Form B, and so forth for Forms D, E and F. Thus, 118 Ss took each form. It is assumed that the systematic distribution of test forms resulted in a high degree of randomization of the Ss in terms of the abilities being measured. In addition, this systematic manner of distribution avoided having the same test form administered to those sitting next to each other. Before the test began, the instructions given on the first page of the test were read aloud and time was allowed to answer the sample questions on that page. The usual 12-minute time limit was used.
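
The systematic distribution of forms described above can be sketched as a simple rotation. This is an illustration, not the authors' procedure in detail: it assumes the rotation A, B, D, E, F ran continuously over all 590 Ss in seating order (the article does not say how the rotation was carried from one session to the next), and the function name is hypothetical.

```python
from collections import Counter
from itertools import cycle, islice

FORMS = ("A", "B", "D", "E", "F")  # the five Personnel Test forms


def assign_forms(n_subjects: int, forms=FORMS) -> list[str]:
    """Hand out forms in a fixed rotation, one per subject in seating order."""
    return list(islice(cycle(forms), n_subjects))


assignment = assign_forms(590)
counts = Counter(assignment)

# Neighbouring subjects in the rotation never hold the same form.
neighbours_differ = all(a != b for a, b in zip(assignment, assignment[1:]))

print(dict(counts))
print("neighbours differ:", neighbours_differ)
```

With 590 Ss and five forms the rotation yields 118 Ss per form, matching the N reported for each form.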

Wonderlic (1945) has suggested that certain num- 
bers of points be added to the scores of those Ss 
who are 30 years of age and over, the number of 
points varying by age group. Twenty-nine of the 
590 Ss in this experiment were between 30 and 38 
years of age, so three points were added to each of 
their scores, as suggested in the test manual. How- 
ever, the mean score for each of the five forms was 
computed, both before and after the corrections for 
age were made. The over-all significance of the 
observed differences among both mean uncorrected 
and corrected scores was measured through calcula- 
tions of F ratios. The Duncan Multiple Range Test 
(Duncan, 1955) was then used to test all differences, 
taken two at a time, at the 5% level of significance. 

In his table of norms, Wonderlic (1945) also has indicated that there is a positive relationship between years of education and test scores achieved. Since it was assumed that a high degree of randomization of the Ss was achieved through the systematic distribution of test forms, only chance differences in educational level should exist among the Ss taking the five forms. Therefore, the mean years of education by test form were computed, and the significance of differences among those taking the five forms was determined by calculating the F ratio.

Table 1
Mean Scores and Variances by Personnel Test Form
(N = 118)

Form                 A       B       D       E       F
Mean               20.62   16.79   19.47   19.66   21.13
S²                 51.55   39.64   34.28   45.29   35.68
Corrected Mean a   20.77   16.87   19.55   19.84   21.38
Corrected S² a     55.54   42.46   37.51   48.50   45.82

a Scores corrected for age.

Finally, the numerical differences among the un- 
corrected mean scores of the five forms found in the 
present study were compared with the differences 
reported by Hay (1952) and by Weaver and Boneau 
(1956). The purpose of this was to observe if the 
direction of the interform differences tended to be 
the same, that is, if the same forms were consistently 
higher or lower in comparison with the other forms. 


Results 


Wonderlic (1945) suggests that Forms A and B or Forms D and F be paired when two alternate forms of the Personnel Test are used. In the present study, neither of these pairs of forms was found to be mutually comparable when the scores were corrected for age. The mean uncorrected scores for the five forms ranged from 16.79 for Form B to 21.13 for Form F, a total range of 4.34 score points, while the mean corrected scores ranged from 16.87 for Form B to 21.38 for Form F, a total range of 4.51 score points. The means and variances for all of the forms, both uncorrected and corrected for age, are presented in Table 1. The analyses of variance, included in Table 2, indicate that the observed differences among the means are significant at the 1% level of confidence whether or not the scores are corrected for those 30 years of age or older.
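
The F ratios behind this statement follow directly from the sums of squares and degrees of freedom in Table 2. The sketch below simply redoes that arithmetic; the function name is hypothetical, and the input figures are the ones printed in Table 2, not recomputed from raw scores.

```python
def f_ratio(ss_between: float, df_between: int,
            ss_within: float, df_within: int) -> tuple[float, float, float]:
    """Return (mean square between, mean square within, F) for a one-way layout."""
    ms_between = ss_between / df_between
    ms_within = ss_within / df_within
    return ms_between, ms_within, ms_between / ms_within


# Sums of squares and degrees of freedom as reported in Table 2.
uncorrected = f_ratio(1355.8, 4, 23754.5, 585)
corrected = f_ratio(1418.8, 4, 23589.7, 585)

for label, (ms_b, ms_w, f) in (("uncorrected", uncorrected),
                               ("corrected", corrected)):
    print(f"{label}: MS between = {ms_b:.1f}, MS within = {ms_w:.1f}, F = {f:.2f}")
```

The results agree with the tabled mean squares and, to within rounding, with the tabled F ratios of 8.35 and 8.80.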

The 10 possible differences between the means of both uncorrected and corrected scores are presented in Table 3. The number of score point differences between pairs of mean uncorrected scores ranged from .19 between Forms D and E to 4.34 between Forms B and F, while the differences between pairs of mean corrected scores ranged from .29 between Forms D and E to 4.51 between Forms B and F. In almost every case, correcting the scores for those 30 years of age or over had the effect of increasing obtained differences between the means of pairs of test forms. In only one case was the difference reduced, that being the difference between the mean scores of Forms A and E, which was reduced from .96 to .93 score points by applying the correction. On the basis of the Duncan Multiple Range Test, the mean score for Form B differed from all of the other forms of the Personnel Test at the 1% level of significance whether or not corrections are made for age factors. In addition, Forms D and F differed from each other at the 5% level of significance when corrections for age were made. It had been anticipated that since Forms A and B are the most recently constructed forms of the test, there would be less difference between these two forms than between other pairs of test forms. As indicated in Table 3, this was not the case. The difference between the uncorrected mean scores for Forms A and B was 3.83 score points, while the corrected scores differed by 3.90 score points. These differences were exceeded only by the differences between Forms B and F.

Table 2
Analyses of Variance for Differences Among Scores on Personnel Test Forms

Source of Variance            Sum of Squares    df    Mean Square     F       P
Between Forms                     1,355.8         4      339.0       8.35    <.01
Within Forms                     23,754.5       585       40.6
Total                            25,110.3       589
Between Forms Corrected a         1,418.8         4      354.7       8.80    <.01
Within Forms Corrected a         23,589.7       585       40.3
Total                            25,008.5       589

a Scores corrected for age.

Table 3
Differences Between Mean Scores on Personnel Test Forms
(N = 118) a

Form     A       B         D         E         F
A        x     3.83**    1.15       .96       .51
               3.90**    1.22       .93       .61
B                x       2.68**    2.87**    4.34**
                         2.68**    2.97**    4.51**
D                          x        .19      1.66
                                    .29      1.83*
E                                    x       1.47
                                             1.54
F                                              x

a The second line of each pair (set in italics in the original) gives the difference between means of scores corrected for age.
* Significant at the 5% level on the basis of the Duncan Multiple Range Test.
** Significant at the 1% level on the basis of the Duncan Multiple Range Test.
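
The ten pairwise differences can be recomputed from the form means in Table 1. The following sketch does only that arithmetic; it does not reproduce the Duncan Multiple Range Test, and the function name is hypothetical.

```python
from itertools import combinations

# Form means as reported in Table 1.
UNCORRECTED = {"A": 20.62, "B": 16.79, "D": 19.47, "E": 19.66, "F": 21.13}
CORRECTED = {"A": 20.77, "B": 16.87, "D": 19.55, "E": 19.84, "F": 21.38}


def pairwise_differences(means: dict[str, float]) -> dict[tuple[str, str], float]:
    """Absolute difference between every pair of form means, to two decimals."""
    return {(a, b): round(abs(means[a] - means[b]), 2)
            for a, b in combinations(means, 2)}


unc = pairwise_differences(UNCORRECTED)
cor = pairwise_differences(CORRECTED)

largest = max(unc, key=unc.get)   # pair with the largest uncorrected difference
smallest = min(unc, key=unc.get)  # pair with the smallest uncorrected difference

for pair in unc:
    print(pair, unc[pair], cor[pair])
```

The recomputed values give the largest difference for Forms B and F and the smallest for Forms D and E, as stated in the text.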
That these differences among test forms cannot be ascribed to a chance variation among the Ss in regard to their educational level was demonstrated by testing the significance of observed differences in educational level for those taking the different forms of the test. The mean years of education by test form ranged from 11.74 for Form A to 11.80 for Form D, a total of just .06 years. The variances ranged from .80 for Form F to 1.85 for Form A. The mean number of years of education and the variances involved for all of the forms are summarized in Table 4. The F ratio for the observed differences is not significant at the 5% level being tested, as indicated in Table 5.

Table 4
Mean Years of Education and Variances by Personnel Test Form
(N = 118)

Form      A        B            D       E       F
Mean    11.74    11.77        11.80   11.76   11.76
S²       1.85    [illegible]   1.17     .82     .80

Table 5
Analysis of Variance Table for Differences in Years of Education by Personnel Test Form

Source of Variance    Sum of Squares    df    Mean Square     F       P
Between Forms               5.7           4       1.42       1.21    >.05
Within Forms              681.0         585       1.17
Total                     686.7         589

2 Hartley’s Test (David, 1952) employed at the 5% level, indicates that heterogeneity of variance exists. However, since the effect of the heterogeneity is to inflate the F that is obtained, no correction is necessary in the present situation involving a nonsignificant F.

[Fig. 1. Comparison of the mean scores obtained by Hay and by Weaver and Boneau with those obtained in the present study. Horizontal axis: Wonderlic form (A, B, D, E, F). Vertical axis: mean scores.]

The mean scores found in the present study are considerably lower than those reported by Hay (1952) and by Weaver and Boneau (1956), as indicated in Fig. 1. This, of course, is a function of the different types of samples involved in each study. However, the order and magnitude of differences among test forms that were found in these studies are consistent. In the studies involving all five forms of the Personnel Test, Form B had the lowest mean score, Forms D and E differed only slightly from each other, and Form F had the highest mean score. A discrepancy exists, however, in the relative position of the Form A mean score in the two studies. The mean score for Form A was relatively higher in the present study than it was in the study by Weaver and Boneau (1956). On the basis of their study in an academic situation, Weaver and Boneau suggested that the forms fall roughly into two groups, Forms A and B comprising a group of greater difficulty and higher variability than Forms D, E and F. This hypothesis is not supported by the results of the present study, which was conducted in an industrial testing situation.


Summary and Conclusions 


Sixteen groups consisting of 590 male ap- 
plicants for apprenticeship programs in a 
large manufacturing company were tested 
using all five forms of the Wonderlic Person- 
nel Test (Forms A, B, D, E and F). Every 
fifth man took Form A, every fifth man took 
Form B, and so forth for Forms D, E, and F, 
so that 118 Ss took each form. For those Ss 
30 years of age or over, obtained scores were 
corrected by adding score points as suggested 
in the test manual. The mean scores by 
Wonderlic form were computed, and the sig- 
nificance of obtained differences among the 
forms was tested by calculating F ratios and 
by using the Duncan Multiple Range Test to 
examine all possible differences between pairs 
of means for both uncorrected and corrected 
scores. In order to ascertain that obtained 
differences were not due to a chance distribu- 
tion of Ss in regard to educational level, the 
significance of difference in years of education 
by test form was also tested by calculating the F ratio.

It was found that Form B of the Personnel Test was more difficult than any of the other forms at the 1% level of significance, whether or not score corrections for age are made, and that Forms D and F differed from each other at the 5% level of significance when such corrections are made. These differences could not be ascribed to differences in the educational level by test form, which were not significant at the 5% level tested. The direction and magnitude of differences among the forms were found to be similar to differences reported in previous studies of the Personnel Test.

On the basis of this study, it is recommended that Form B of the Personnel Test not be regarded as directly equivalent to any of the other four forms of the test and that Form D not be regarded as directly equivalent to Form F in industrial testing situations similar to the one in the present study. These findings are particularly pertinent, since Wonderlic suggests that when two forms of the test are to be used, the best combinations are A and B or D and F. Neither of these pairs was found to be mutually comparable when the suggested scoring procedure was followed.


Received July 21, 1958. 


References 


David, H. A. Upper 5 and 1% points of the maximum F ratio. Biometrika, 1952, 39, [pages illegible].

Duncan, D. B. Multiple range and multiple F tests. Biometrics, 1955, 11, 1-42.

Hay, E. N. Some research findings with the Wonderlic Personnel Test. J. appl. Psychol., 1952, 36, 344-345.

Hovland, C. I., & Wonderlic, E. F. A critical analysis of the Otis Self-Administering Test of Mental Ability—Higher Form. J. appl. Psychol., 1939, 23, 367-387.

Kazmier, L. J. Comparability of the five forms of the Wonderlic Personnel Test and its use as an anchor test in equating two other tests of mental ability. Unpublished Master’s thesis, Wayne State Univer., 1958.

Weaver, H. B., & Boneau, C. A. Equivalence of forms of the Wonderlic Personnel Test: A study of reliability and interchangeability. J. appl. Psychol., 1956, 40, 127-129.

Wonderlic, E. F. Wonderlic Personnel Test Manual. Northfield, Illinois: Author, 1945.




OVER- AND UNDERACHIEVEMENT AND THE EDWARDS
PERSONAL PREFERENCE SCHEDULE ¹


ROBERT E. KRUG 


Carnegie Institute of Technology 


Gebhart and Hoyt (1958) have presented data showing that several scales of the Edwards Personal Preference Schedule (EPPS) discriminate between over- and underachievers in two Schools at Kansas State College. These results were of considerable interest to the writer, since in an unpublished study of apparently identical design, he had found no significant differences. In an effort to understand the discrepant findings, one difference in procedure became apparent. At Carnegie Tech, optimal prediction of academic performance in the College of Engineering and Science is obtained by an equation which employs three achievement tests ² and high school standing as predictors. The equation produces stable cross-validity coefficients in the .65 to .70 range. The best estimate of performance, and hence the definition of a baseline for over- or underachievement, is thus based on measures of past performance. In the Gebhart-Hoyt study, aptitude measures ³ were employed to define expected performance. If it could be shown that this difference in procedure was responsible for the discrepancy in results, it would seem to offer additional support for the EPPS as a valid device for the description of differential achievement, since it would indicate that the scales were reflecting personality characteristics associated with a past as well as a future record of over- or underachievement.

The present study had two objectives. First, it was designed to replicate the Gebhart-Hoyt study in regard to engineers and, as an extension, to test the difference between aptitude-based and performance-based determinations of expected performance.


1 The clerical and computational assistance of Barbara Woods is gratefully acknowledged.
2 College Entrance Examination Board tests in English, advanced math, and physics or chemistry.
3 Pre-Engineering Ability Test for the School of Engineering and the ACE for the School of Arts and Sciences.


Procedure 


The samples were drawn from the population of 
411 freshmen who entered the College of Engineer- 
ing and Science in September, 1956. Two predic- 
tions of grade average were made for each student. 
The performance-based prediction employed the three 
achievement tests and high school standings; the 
aptitude-based prediction used the verbal and math 
scores from the College Board Scholastic Aptitude 
Test. Treating each set of predictions separately, 
students were categorized as low, average, or high
predicted. A student was assigned to the over- 
achievement group if his first-year grade average was 
above the predicted score, to the underachievement 
group if the average was below that predicted. A 
student thus might be an overachiever on one basis 
and an underachiever on the other. 
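
The grouping rule just described, together with the selection of the most discrepant Ss per cell described in the Procedure, can be sketched as follows. This is an illustration under stated assumptions: the dividing points of 1.80 and 2.20 come from the footnote to Table 1, but the treatment of values falling exactly on a dividing point, the exclusion of Ss whose achieved and predicted averages are equal, and all names and example records are hypothetical.

```python
from dataclasses import dataclass

LOW_CUT, HIGH_CUT = 1.80, 2.20  # dividing points for predicted grade average


@dataclass
class Student:
    name: str
    predicted: float  # predicted first-year grade average
    achieved: float   # obtained first-year grade average


def level(predicted: float) -> str:
    """Low / average / high predicted (boundary handling is an assumption)."""
    if predicted < LOW_CUT:
        return "low"
    if predicted > HIGH_CUT:
        return "high"
    return "average"


def group(student: Student):
    """'over' if achieved exceeds predicted, 'under' if below, else None."""
    if student.achieved > student.predicted:
        return "over"
    if student.achieved < student.predicted:
        return "under"
    return None


def select_sample(students, per_cell=20):
    """Keep, in each (group, level) cell, the per_cell most discrepant students."""
    cells = {}
    for s in students:
        g = group(s)
        if g is not None:
            cells.setdefault((g, level(s.predicted)), []).append(s)
    return {cell: sorted(members,
                         key=lambda s: abs(s.achieved - s.predicted),
                         reverse=True)[:per_cell]
            for cell, members in cells.items()}


# Hypothetical records, for illustration only.
demo = [Student("s1", 1.5, 2.5), Student("s2", 1.6, 1.9),
        Student("s3", 2.0, 1.0), Student("s4", 2.5, 2.5),
        Student("s5", 2.6, 3.9)]
demo_sample = select_sample(demo, per_cell=1)
print({cell: [s.name for s in members] for cell, members in demo_sample.items()})
```

Running the same selection once with the performance-based predictions and once with the aptitude-based predictions would yield the two samples, and a student could land in different cells in each.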

The performance-based sample was constructed by selecting from each of six groups (two levels of achievement and three levels of predicted achievement), the 20 Ss whose obtained average was most discrepant from that predicted. The aptitude-based sample was formed by an identical procedure, using the aptitude-based prediction to define the groups. Table 1 presents the average predicted grade and the average achieved grade for each subgroup. Seventy-three Ss were common to the two samples, a fact which reflects the positive correlation between the aptitude and achievement tests used as predictors.

Scores for the 15 EPPS scales were collected for all Ss. The inventory had been completed as part of the freshman orientation program. An analysis of variance using a 2 X 3 factorial design was applied to each of the 15 scales in each of the two samples.

Table 1
Mean First Year Grades (Predicted and Achieved) for the Two Samples
(N = 20 per cell)

Group                      Low    Average    High
Aptitude:
  Over     Predicted      1.66     1.97      2.56
           Achieved       2.53     2.99      3.50
  Under    Predicted      1.68     2.04      2.46
           Achieved       1.02     1.13      1.49
Performance:
  Over     Predicted      1.46     1.98      2.65
           Achieved       2.40     2.80      3.36
  Under    Predicted      1.57     2.00      2.68
           Achieved       1.03     1.27      1.81

4 Predicted averages of 1.80 and 2.20 were dividing points. The grading system assigns 4 points for an A, 3 for a B, 2 for a C, 1 for a D, and 0 for an R. The mean grade average for the freshman class is 2.00; the prediction equations give an identical mean for the distribution of predicted grades.

Table 2
EPPS Scores for Groups of Over- and Underachievers

                         Aptitude-base                          Performance-base
                   Over          Under                     Over          Under
                 (N = 60)      (N = 60)                  (N = 60)      (N = 60)
Scale             M     σ       M     σ       F           M     σ       M     σ       F
Achievement      17.9  3.2     15.5  4.1    13.09c       17.5  3.5     16.1  4.6    [illegible]
Deference        12.0  3.4     12.3  4.3      —          11.2  3.1     12.6  4.5    [illegible]
Order            12.7  4.0     10.8  4.8     5.64a       12.7  4.0     11.9  5.0    [illegible]
Exhibition       14.3  3.4     14.6  3.4      —          14.0  3.0     14.2  3.6      —
Autonomy         14.3  3.8     14.6  4.1      —          14.6  4.1     15.1  4.1      —
Succorance        9.9  4.5     10.2  4.2      —           9.7  4.5     10.0  4.8     1.64
Affiliation      13.0  3.5     15.2  4.3     9.31b       13.3  3.9     14.3  4.3    [illegible]
Intraception     16.2  4.5     14.8  5.9     2.13        15.3  5.1     14.9  6.0     2.42
Dominance        16.2  4.2     16.0  4.5      —          16.4  4.2     15.1  5.1      —
Abasement        14.0  4.3     12.9  4.8     1.80        13.3  4.2     12.9  4.8    [illegible]
Nurturance       12.6  3.9     13.1  4.8      —          11.9  3.8     12.7  4.7    [illegible]
Change           14.6  4.7     16.1  4.3     3.55        15.4  4.6     16.0  4.3     1.13
Endurance        18.0  4.7     15.3  5.7     8.06b       17.4  4.8     16.4  5.8    [illegible]
Heterosexuality  13.9  6.0     16.3  6.2     4.53a       15.4  6.5     15.2  7.3    [illegible]
Aggression       11.2  3.8     13.2  4.9     3.25        12.7  4.6     13.6  4.3    [illegible]

a p < .05. b p < .01. c p < .001. — F < 1.0.

Table 3
EPPS Scores for Groups at Three Levels of Predicted Ability: Aptitude-based Sample

                     Low          Average          High
Scale              M     σ        M     σ        M     σ         F
Achievement       15.9  3.8      17.1  3.8      17.2  3.9       1.57
Deference         13.5  3.4      12.0  3.9      11.1  3.9       3.71a
Order             12.9  4.4      12.1  4.8      10.4  4.4       3.38a
Exhibition        14.4  3.1      14.1  3.7      14.7  3.4        —
Autonomy          14.5  4.0      14.3  3.9      14.5  4.0      [illegible]
Succorance        10.7  3.3       9.9  4.9       9.6  4.7        —
Affiliation       13.8  4.8      14.3  3.9      14.2  3.5      [illegible]
Intraception      14.8  5.5      15.8  5.4      15.8  5.0      [illegible]
Dominance         14.0  4.4      16.6  4.2      17.6  3.8       8.11c
Abasement         15.1  4.4      13.6  5.1      11.6  3.8       6.41b
Nurturance        13.4  4.7      13.5  4.7      11.8  3.5       1.84
Change            15.3  5.4      15.8  4.2      15.0  3.9      [illegible]
Endurance         17.4  4.8      15.8  5.1      16.7  6.1      [illegible]
Heterosexuality   13.6  5.5      14.4  6.6      17.4  5.9      [illegible]
Aggression        11.4  4.4      12.1  4.4      13.3  4.4       1.80

a p < .05. b p < .01. c p < .001. — F < 1.0.


Results 


Table 2 presents means and standard deviations for the groups of over- and underachievers in each sample, as well as the F ratios for the differences between groups.

For the aptitude-based sample, the hypothesis of no difference between groups of over- and underachievers was rejected for five scales. Overachievers in a college of engineering scored significantly higher on the Achievement, Order, and Endurance scales, and significantly lower on Affiliation and Heterosexuality.

For the performance-based group, the null hypothesis was accepted for all scales except Achievement.

Table 3 presents the means and standard deviations for groups at each level of predicted achievement for the aptitude-based sample. In order to conserve space, the equivalent data for the performance-based group are not presented. It is true of the latter sample that the null hypothesis may be accepted for all scales.

In Table 3, correlations between level of predicted ability and five of the EPPS scales are observable for the aptitude-based sample. In addition to the tabled results, significant interactions (between groups and levels) were found for the Deference, Succorance, and Endurance scales in the aptitude-based group.


Discussion 


One objective of the study was to replicate the Gebhart-Hoyt study. The results for the aptitude-based sample provide the relevant data for comparison. The agreement between the two studies is considerable. One measure of the agreement is the correlation between the Kansas State and Carnegie Tech engineering samples.⁵ The 30 pairs of means (two achievement groups and 15 scales) correlate 772. Furthermore, the hypothesis of Gebhart and Hoyt that there are several patterns of over- and underachievement is clearly supported. In the present study, we may characterize the overachieving engineer as one having a strong need to achieve, or a need to keep things orderly, or a need to endure in a task, and these traits are not correlated within the sample of overachievers. The obtained correlations of —.05, .17, and .16 clearly indicate that different individuals are contributing the high scores to the different scales. The same situation prevails in regard to the Affiliation and Heterosexuality scales on which underachievers earn high scores. The correlation between the scales is —.14 within the sample of underachievers.

5 In the Gebhart-Hoyt study, means and standard deviations were presented for two colleges combined. Hoyt kindly provided the writer with the data for their engineering sample alone, thus enabling the comparison which is reported here.

The second purpose of the study was to 
contrast two bases for determining the ex- 
pected performance of the college freshman. 
Despite the fact that a majority of the Ss 
are in both groups, the results seem quite 
clear. If we base our estimate on measures 
of aptitude, the EPPS makes a significant 
contribution toward reducing the residual 
variance (over- and underachievement). If 
we base our estimate on records of past per- 
formance, this contribution tends to wash out. 
It might be argued that the results simply 
reflect the fact that the performance-based 
estimate is the more valid and hence less 
reliable variance is available for the person- 
ality scales to explain. While the fact of a 
less reliable residual is indisputable, the mat- 
ter is of little concern. What seems impor- 
tant is the demonstration that two measures 
are functionally equivalent. The variance 
which is explained by past performance, but 
not by ability, may also be explained by cer- 
tain scales of the EPPS. These scales reflect 
a difference between capacity and perform- 
ance which is useful for predictive purposes, 
but which is also available from records of 
past performance. Since a theory of over- 
or underachievement is interested in account- 
ing for behavior which ability measures do 
not predict, the findings of Gebhart and Hoyt 
replicated in the present study seem of real 
value. 

If one is interested only in the problem of
prediction, the past performance measures will 
often be preferred to a personality inventory 


on grounds of practicality. 




Summary and Conclusions 


The major objectives of the study were 
(a) to investigate the relationship between 
EPPS scale scores and over- and under- 
achievement in a college of engineering and 
(b) to examine two bases (aptitude tests vs. 
achievement records) for determining an ex- 
pected level of performance. 

Two samples, each consisting of 120 Ss, were
selected; the samples were termed aptitude- 
based and performance-based. Seventy-three 
Ss were in both samples. In each sample 
there were 20 Ss at each of three levels of 
expected performance for both over- and 
underachievement groups. In regard to the 
first stated objective, analyses of variance of 
the aptitude-based sample permit the follow- 
ing conclusions: 


1. Overachievers scored significantly higher on the Achievement, Order, and Endurance scales, and significantly lower on Affiliation and Heterosexuality.

2. These scales are statistically independent within the relevant samples, indicating that several patterns of over- and underachievement are present.

3. High ability Ss score significantly higher than low ability Ss on Dominance and Heterosexuality and significantly lower on Deference, Order and Abasement.

4. Significant interactions between ability level and over- and underachievement were present for the Deference, Succorance, and Endurance scales.


In regard to the second objective, the sepa- 
rate analyses of performance-based and apti- 
tude-based groups lead to the following 
conclusions: 


1. When over- and underachievement are taken as departure from a regression line based on achievement tests and high school record, only the Achievement scale of the EPPS discriminates between the two groups. In addition, the correlations between scale score and ability level disappear, and there are no significant ability-achievement interactions.

2. The variance which EPPS scores account for is the same variance that is explained by an S’s past record of ability-performance differential.

3. Theories of over- and underachievement may start with the personality description of Ss who deviate from an aptitude-based regression line. Certain of the EPPS scales provide labels descriptive of this behavior.

4. For purposes of selection, the EPPS and certain evidences of past performance are functionally equivalent.


Received July 25, 1958. 


References 


Edwards, A. L. Manual for the Edwards Personal Preference Schedule. New York: Psychological Corp., 1954.

Gebhart, G. G., & Hoyt, D. P. Personality needs of under- and overachieving freshmen. J. appl. Psychol., 1958, 42, 125-128.




QUANTIFICATION OF THE TERM “OBJECTIONABLE”
AS APPLIED TO COLORANTS IN NATURAL
WATERWAYS ¹


JOHN H. WAKELEY ²
North Carolina State College


Industries often exercise sufficient control [illegible] of their wastes so that no noxious or toxic substances are added to natural waters. However, even when such care is taken, natural waters may be discolored by [illegible] wastes. The discoloration is offensive to individuals with riparian rights on the streams and also to those who wish to make recreational uses of the waterways. Industries which discharge colorants may incur the ill-feelings or the lawsuits of people in contact with the streams.

Some regulatory agencies have attempted to establish standards with regard to colorants in streams. These standards are frequently arbitrary and difficult for an industry to meet. For example, the New England Interstate Water Pollution Control Commission stated that the maximum colorant to be permitted in streams “will be amounts that are not objectionable” (Feller & Newman, 1951, p. 3). The term “objectionable” is inappropriate as a standard since it is not specified in terms of objectionable to whom under what conditions, nor is it related to any quantitatively measurable scale.

The study reported below was undertaken to determine a method for giving quantitative meaning to the qualitative term “objectionable” as it applies to colored wastes in streams.

To establish a method for quantifying the term “objectionable,” there were three conditions to be fulfilled. One, there had to be a way of measuring color; two, there had to be a way of relating different colors to a measurable scale; and three, there had to be a device for simulating a natural stream so that qualitative judgments of objectionable colors could be obtained under laboratory conditions.

1 This research, directed by Harold M. [surname illegible] and Nelson L. Nemerow, was sponsored by the National Institute of Health.
2 Now at Michigan State University.

A Photovolt Photoelectric Reflection Meter, 
Model 610 with a 610-D search unit, modified 
by Coss (Coss & Nemerow, 1958) for use in 
measuring the color of liquids, was employed 
to measure colors. Hunter’s (1942) tristimulus
color filters were used in the instrument,
and colors were defined in terms of
dominant wave length, luminance, and purity 
in accordance with the procedure adopted by 
the International Commission on Illumina- 
tion (Judd, 1933). Using this system of 
measuring color, any color can be given a 
location in color space. 

The method for relating different colors 
employed the Hunter-Scofield color-difference 
formula (Hunter, 1942). This formula pro- 
vides a means for measuring the distance be- 
tween two colors in color space in terms of 
the National Bureau of Standards’ unit of 
color-difference (Hunter, 1942, p. 519). The 
formula was experimentally derived and is 
expressly intended for use with the Hunter 
tristimulus color filters. A complete descrip- 
tion of the formula and examples of its use 
can be found in Hunter (1942) and Wakeley
(1958, p. 48).

Qualitative judgments were obtained by us- 
ing the Streamviewer to present the water 
from a natural stream (the color of which had 
been measured) to an S. The Streamviewer 
was essentially a box constructed of one-half inch plywood with dimensions of 24 inches from front to back, 25 inches from side to side, and 20 inches from top to bottom. The
box had a viewing slot which permitted one 
person at a time to view the interior. A trough 
was fitted to the bottom of the box at the 
rear and in line of sight for a person looking 
through the slot. A circulating pump was 
fitted to the trough to maintain a flow of 
water in the trough. The interior of the box
was designed to represent a rural, summer
scene from the Drowning Creek region near 
Hoffman, North Carolina. The rear wall of 
the interior was painted to represent the re- 
gion mentioned. The bottom was painted to 
represent grass. A daylight-type fluorescent 
lamp was used to light the interior. When 
appropriately colored water was put into the 
trough, the apparatus in operation simulated 
a natural stream in its natural setting. A com- 
plete description of the Streamviewer may be 
found in Wakeley (1958, p. 52). 


Procedure 


The procedure for obtaining qualitative judgments 
consisted of having an S look into the Streamviewer 
and observe while the natural color of the water 
was gradually changed by the addition of a colorant. 
When an S objected, a sample of the objectionable 
water was removed for color measurement. The 
trough of the Streamviewer was again filled with 
the natural water, and the S again observed while
a different colorant was added until he objected.
This process was repeated with each of six colorants.
Twenty Ss followed the same procedure.

Water used in the trough of the Streamviewer 
was matched to a sample of water taken from 
Drowning Creek near Hoffman, North Carolina, 
during a period of average summer flow and average 
color conditions. Records of the Geological Survey,
United States Department of the Interior, Raleigh,
North Carolina, were used to establish the average
values mentioned above.


... land Counties, North Carolina, since these counties were near the site where the stream was sampled. This sample was used because it ... which has contact with the stream ... because it ... that after the age of 14 years, males evidence better color discrimination ... maximum at about ... slowly.

Six colorants (a red, an orange, a yellow, a green, a blue, and a violet) were used.


Results and Discussion 


The data collected consisted of the 120 
colors found objectionable, i.e., each of 20 
Ss objected to a certain color resulting from 
the addition of each of six different colorants. 




The colors found objectionable were compared to the color of the natural stream by means of the Hunter-Scofield color-difference formula. The twenty color-difference scores for each colorant added were tested for skewness and kurtosis, and it was determined that no distribution of color-difference values was either significantly kurtotic or significantly skewed.

Following the tests for skewness and kurtosis, the normal distribution curve was adopted as the model for obtaining a color-difference value for each colorant which was objectionable to fewer than 5% of the population represented by the sample. This color-difference value, subsequently called the 5% OP (Objectionable Point) score, was computed by subtracting from the mean score the multiple of the standard deviation (about 1.645) that cuts off the lowest 5% of a normal distribution: 5% OP = mean score − 1.645 × standard deviation. Table 1 shows the 5% OP scores obtained. Figure 1 presents the relationship of the 5% OP scores to the original color of the stream.

Figure 1 shows the area of color change about the original color of the stream which is objectionable to less than 5% of the population represented by the sample used and also the area objectionable to 50% of the same population. The points on each of the six lines radiating from the center (original color of the stream) were color-difference values obtained in this study. The lines which connect the points and which define the two areas are estimates of the values which would be obtained if different colorants were used.


Table 1

Color-Difference Values Obtained for Each Distribution of Colorants

Colorant    5% OP Score    Mean Score    SD
Red         14.34          52.95         ...
Orange      61.22          88.74         ...
Yellow      21.98          73.69         ...
Green       40.09          83.60         ...
Blue        80.56          132.24        ...
Violet      31.83          75.50         ...

Note. All values are in National Bureau of Standards units of color difference.


Fig. 1. Relationship of 5% OP scores and mean scores to the original color of the sample stream.


If a color standard were to be set for Drowning Creek from the results of this study, the standard might be as follows: An industry which discharges colorants into Drowning Creek must exercise sufficient treatment of waste and care in discharge policy so that a color difference between the standard color of the stream (as determined by a suitable agency) and the color of the stream after an effluent is introduced is never greater than the values which make up the 5% OP boundary about the standard color.
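The 5% OP computation and the proposed standard can be sketched in a few lines. This is only an illustration under stated assumptions: the function names are hypothetical, the sample standard deviation is assumed, and the multiplier is taken from the normal curve rather than from the original computation sheets.

```python
from statistics import NormalDist, mean, stdev


def objectionable_point(mean_score, sd, fraction=0.05):
    """Color difference that `fraction` of a normally distributed
    population would already find objectionable (hypothetical helper)."""
    z = NormalDist().inv_cdf(fraction)  # about -1.645 when fraction is 0.05
    return mean_score + z * sd


def objectionable_point_from_scores(scores, fraction=0.05):
    """Same value computed from raw color-difference scores.
    Assumes the sample standard deviation is the intended estimate."""
    return objectionable_point(mean(scores), stdev(scores), fraction)


def within_standard(color_difference, op_score):
    """True when an observed color difference stays inside the OP boundary."""
    return color_difference <= op_score
```

Setting `fraction=0.5` returns the mean score itself, which corresponds to the 50% boundary described for Figure 1.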

The values obtained in this study apply only to the population sampled. To employ this method for setting standards for color pollution, it would be necessary to draw a sample of observers from the population which had contact with the certain stream for which a standard was desired.

The method of this study, namely, having individuals in contact with the particular stream give judgments as to when the color of a stream becomes objectionable, quantifying judgments in terms of a color-difference score, relating the quantified judgments to the normal curve, and determining certain percentages of the population which object to a certain color change, is limited. With this method each stream presents a unique problem. Further investigation directed toward making the method applicable to areas larger than a specific stream is needed.




Investigations directed toward making the 
method more widely applicable should con- 
sider the following questions. What is the 
relationship among variables such as original 
color of stream, frame of reference of the 
viewers (sportsmen, tourists, farmers, etc.), 
and hue of colorant? How do the variables 
mentioned above influence the size and shape 
of the 5% OP area as shown in Fig. 1? Is 
there a difference in tolerance between those 
who perceive the colorants as esthetically un- 
pleasing and those who perceive the colorants 
as interfering with the usefulness of the 
stream? What relation exists between judgments using the Streamviewer and judgments under natural conditions?3


Summary and Conclusions 


A study was conducted to provide a basis 
for determining a method for relating judg- 
ments of the objectionableness of colorants 
in natural waters to a measurable scale of 
color difference. 

Twenty Ss observed a simulated natural 
stream as it was gradually changed in color 
by the addition of each of six different color- 
ants. Every S indicated when the color of 
the stream became objectionable for each of 
the colorants. A color-difference formula was 
used to determine how greatly the objection- 
able colors differed from the original color of 
the stream. The distribution of scores for 
each of the colorants was examined and found 
to be distributed in an essentially normal 
manner. Using the normal curve, 5% OP 
scores were determined for each of the sepa- 
rate colorants. A 5% OP score represented 
a color-difference between the original stream 
color and the color, resulting from the addi- 
tion of a certain colorant, which was objec- 
tionable to fewer than 5% of the population 
represented by the sample used. 

The major conclusion of this study was 
that the term “objectionable” as it applies to 
colored wastes in streams can be quantified. 
This conclusion was based on two specific 
findings: 

1. The point at which the color of a stream becomes objectionable as a result of the addition of a colorant was expressed in terms of a number. This number represented the difference between the original color of the stream and the certain color objected to.

2. The color-difference scores for the addition of any certain colorant were found to be distributed normally in the sample used in this study.

3 This question is currently being investigated by the Department of Civil Engineering at North Carolina State College.


An inference was made that the normal 
curve can be employed to determine color 
differences which will be objectionable to cer- 
tain percentages of a population which is in 
contact with a particular stream. 


Received July 28, 1958. 


John H. Wakeley 


References 


Coss, J. C., & Nemerow, N. L. Color measurements with the Stream Colorimeter. Sewage and Industrial Wastes, 1958, 30, 804-811.

Feller, G., & Newman, Janice. Industrial waste treatment. Industry and Power, 1951 (June).

Hunter, R. S. Photoelectric tristimulus colorimetry with three filters. J. opt. Soc. Amer., 1942, 32, 509-538.

Judd, D. B. The 1931 I.C.I. standard observer and coordinate system for colorimetry. J. opt. Soc. Amer., 1933, 23, 359-374.

Smith, H. C. Age difference in color discrimination. J. gen. Psychol., 1943, 29, 191-226.

Wakeley, J. H. Measurement of objectionable colors resulting from industrial wastes in streams. Unpublished master’s thesis, North Carolina State College, 1958.


Journal of Applied Psychology
Vol. 43, 1959


A STUDY OF ENGINEERS’ CRITERIA FOR CREATIVITY 1


THOMAS B. SPRECHER 2


University of Maryland 


It is a common belief that the creative person is vital to our national welfare and to the well-being of individual industries. Recent years have seen a considerable increase in the number of studies concerned with creativity in engineers and scientists. Not at all uniquely, criterion development is a crucial problem, although frequently it is assumed that creativity is a concept that can be defined, given to a set of judges, and used by them. To the author’s knowledge, no previously reported study has proceeded by allowing judges to read into the term creativity whatever meaning they chose and then attempting to specify this meaning. This study asserts the premise that difference among expert judges regarding the meaning of the term creativity is meaningful variability indicative of the kinds of ideas held. If creativity means different things to different people, this variability should be explored before attempts to define it are undertaken.
In order to describe the creative person in an engineering setting, engineers in a large industrial organization producing aircraft equipment were questioned. This study reports on those things engineers and supervisors mentioned when asked to tell why they felt that some engineers and some solutions to engineering problems were more creative than others. Estimates of the predictability of the criteria used in the study will also be reported. Although exception may be taken to the use of engineers to describe creative people per se, these men were well acquainted with each other and the requirements of their jobs demanded new ideas to meet new and complex situations. Whether or not we grant the ability of engineers concerned with the production of new equipment to describe the creative person, we will report here what such engineers think creativity means. Important practical considerations require that their opinions be considered carefully.

1 Submitted to the graduate school of the University of Maryland in partial fulfillment of the requirements for the Ph.D. degree. The author is very grateful for the assistance of the following individuals: John W. Gustad, University Counseling Center, University of Maryland, College Park, Maryland; and Ray C. Hackman, Psychological Service of Pittsburgh.
2 Presently Consulting Psychologist at Psychological Service of Pittsburgh.


Method 


Sampling. One hundred and seven engineers were 
drawn at random from a population of work groups. 
The basic decision affecting the sampling plan was 
the decision to attempt to get ratings on each man 
in the study from his peers as well as from his su- 
pervisor(s). The sampling plan required that the 
men in these work groups be sufficiently well ac- 
quainted with each other so that the men could rate 
each other. A further restriction was that there be
at least 15 men in each of the groups so that 12 men
could be available without being restricted by unexpected
travel, sickness, etc. The Ss were males, and
without exception were performing engineering work
at the time of the study. All except one had as a
minimum either a B.S. degree in engineering or its
equivalent.
After contact with all of the engineering departments and the supervisors, a population of departments consisting of at least four departments within each of the three areas of service, project, and research was developed. These three areas had been previously selected as representing three broad types of functions performed by the various departments within the company and which might serve as bases for differential assignment. Advance judgment in the company assumed this difference among work areas was important. Random sampling was used to select three departments out of the four available within each of the areas, and within each department random sampling was used to call in 12 men from those available. The sample finally consisted of 36 men from research groups, 36 from service groups, and 35 from project groups.
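The two-stage draw described above (three of the available departments per area, then 12 men per department) can be sketched as follows. The data layout, the names, and the `seed` parameter are assumptions made for illustration and are not taken from the study's records.

```python
import random


def draw_sample(areas, departments_per_area=3, men_per_department=12, seed=None):
    """Two-stage random sample.

    `areas` maps an area name to a dict of {department name: list of men}.
    Returns {(area, department): list of selected men}.
    """
    rng = random.Random(seed)
    sample = {}
    for area, departments in areas.items():
        # Stage 1: choose departments at random within the area.
        chosen = rng.sample(sorted(departments), departments_per_area)
        for department in chosen:
            # Stage 2: choose men at random within each chosen department.
            sample[(area, department)] = rng.sample(
                departments[department], men_per_department
            )
    return sample
```

With three areas this plan yields nine departments and a nominal 108 men; the sample actually reported was 107.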
Procedure. The testing itself was conducted in one session and lasted approximately two hours. The men were given a variety of paper and pencil tests measuring various factorially defined abilities. They were also asked to solve three brief open-end engineering problems. These tests and the engineering problems will be described later. Immediately following the administration of these psychological and engineering tests, each man was asked to rank in terms of creativity the other 11 men from his section who were also participating in the study. No definition of creativity was given to him. Ties in
the ranking procedure were allowed, and some of 
the men partially or completely refused to complete 
the rankings. The number refusing to participate 
varied considerably from department to department, 
but did definitely reduce the number of judges avail- 
able. After most had filled out the ranking form, 
they were asked to give reasons why they had 
chosen the top two men as more creative than the 
bottom two men. They were asked to make their 
comments specific and to give recent incidents sup- 
porting them. 

A ranking form similar to the foregoing was given 
to the supervisors of each of the nine groups par- 
ticipating in the study, and each supervisor was 
asked to rank the 12 men from his work area with 
regard to their creativity. Again, no definition of 
creativity was given. The supervisors also were 
asked to give reasons for the differences between the
high ranked and low ranked men.

A similar procedure involving both peers and supervisors was used in judging the creativity of the answers to the engineering problems. The ranking procedure will be described in detail later, but after ranking sets of such answers, each man compared the sets of answers located at the two extremes and gave specific reasons why they differed in their creativity.

Predictors and criteria. The following are the paper and pencil tests previously referred to which in general were used as predictors of the ratings of ... clever.3

B. Synonyms: the number of synonyms given to common words, giving a score on Associative Fluency.

C. Sign Changes: the number of simple arithmetic operations carried out successfully when the ordinary meanings of the elementary symbols of arithmetic were changed from section to section of the test, producing a score on Adaptive Flexibility.

D. Vocabulary: a high level multiple choice general vocabulary test that represented the factor of Verbal Comprehension.

II. Scores on the engineering problems:
A. Shop problem
B. Mass Flow problem
C. Mechanization problem

With the exception of the vocabulary test, the ability tests were adapted from those devised by J. P. Guilford (Guilford, Wilson, Christensen, & Lewis, 1951; Guilford, Wilson, & Christensen, 1952) in his studies of higher level aptitudes. Since his tests had been developed for the Air Force, it was not possible to use the identical tests, but versions closely similar to them were developed.

3 The correlation between these two scores was not significant at the .05 level.

The engineering problems were problems technically relevant to the work performed at the company and were constructed by supervisors. Ten problems were developed, and three were used in this study. These three were those which successfully passed a screening procedure in which 10 engineers worked these problems under conditions simulating those to be used in the study. These conditions involved 15-minute time limits, individual work, and the problem stated as an open-end question. The screening also included some consideration of the variety of answers obtained from each man as well as the variety of answers from the group as a whole.

The six criteria for creativity used are presented in Table 1. The last two criteria have already been described.

To score the answers to the engineering problems, a ranking procedure was used which involved ranking the whole set of answers given by each man to each problem. All the answers that each man gave to each problem were typed on a separate page with five carbon copies, and each set of answers was ranked by five different engineers. All the rankings used in this study were transformed by assuming that the ranking corresponded to a normal distribution in which the mid-point of each of 10 cells (for example) was assumed to represent a point which was the center of 10% of the total area in a normal distribution. The corresponding z score for this point was determined, multiplied by 10, and then rounded to the nearest whole number. The lowest negative number was called one and the others were formed by adding the numerical distance of each successive number from its next lowest neighbor. This resulted in a scale in which the highest number was associated with the best performance.
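The rank transformation just described can be sketched as below, as one reading of the procedure rather than the author's exact computation. The function name is hypothetical, rank 1 is assumed to mean the most creative, and tied ranks are not handled.

```python
from statistics import NormalDist


def normalized_rank_scores(n_ranks):
    """Map ranks 1..n_ranks (1 = best, an assumption) to scale scores.

    Each rank occupies one of n_ranks equal-area cells of a normal
    distribution; the z score at the cell mid-point is multiplied by 10,
    rounded, and shifted so that the lowest value becomes 1.
    """
    normal = NormalDist()
    raw = []
    for cell in range(1, n_ranks + 1):           # cell 1 is the lowest cell
        midpoint_area = (cell - 0.5) / n_ranks   # center of the cell's area
        raw.append(round(10 * normal.inv_cdf(midpoint_area)))
    offset = 1 - raw[0]                          # lowest number is called one
    scores = [value + offset for value in raw]
    # The best rank receives the highest scale score.
    return {rank: scores[n_ranks - rank] for rank in range(1, n_ranks + 1)}
```

Shifting by a constant preserves the distances between successive values, which is what the additive rule in the text amounts to.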

For each man in the study, the number of patent disclosures submitted through company channels in the past year was available. However, of the 90 men in the study who had been with the company at least a year and so had an opportunity to develop a patent disclosure, only 24 had had one or more patent disclosures. Consequently, each man was classified as having either none or at least one patent disclosure.


Table 1

Criteria for Creativity

Performance Criteria
I    Rankings of answers to Shop problem
II   Rankings of answers to Mass Flow problem
III  Rankings of answers to Mechanization problem
IV   Number of patent disclosures in past year

Subjective Criteria
V    Rankings of men by supervisors
VI   Rankings of men by peers

Reliability of the rating procedures. Inter-rater agreement in judging the answers to the engineering problems involved the correlation between two sets of engineers. These two sets of engineers were randomly formed from among the engineers who had rated the answers to the engineering problems. The correlation of these raters within each of the three engineering problems was boosted as follows by a factor of 2.5 using the Spearman-Brown correction: for the Shop problem, from .69 to .85; for the Mass Flow problem, from .52 to .73; and for the Mechanization problem, from .70 to .85.
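The boosted values can be checked against the standard Spearman-Brown prophecy formula, r_k = k r / (1 + (k - 1) r). The sketch below is illustrative and the function name is hypothetical.

```python
def spearman_brown(r, k):
    """Reliability expected when the number of raters is multiplied by k,
    given reliability r for the original number of raters."""
    return k * r / (1 + (k - 1) * r)


# Reliabilities reported for the three engineering problems, boosted by 2.5.
for label, r in (("Shop", 0.69), ("Mass Flow", 0.52), ("Mechanization", 0.70)):
    print(label, round(spearman_brown(r, 2.5), 2))
```

With a factor of 2.5 the formula reproduces the three boosted correlations given in the text.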

The data also permitted estimates of the extent of agreement among different engineers on which men were creative. An estimate of the agreement of one randomly selected peer with another similarly selected peer in rating men gave a correlation of .55; and this figure was boosted to a substantially higher figure of .87 by using the Spearman-Brown prophecy formula since more than five peers rated each individual. Slightly dissimilar conditions hold for the estimated reliability of supervisors’ ratings of men. In some cases only one supervisor was able to rate the men in the department. Consequently, the correlation here applies only to those departments where at least two or more supervisors had provided ratings of the men as men. This correlation was 84, boosted to <66. The agreement of supervisors with peers in rating the men was calculated by using the average supervisor rating of each man, and the average peer rating on each man. The correlation obtained here was .64. This correlation, recalculated using only those departments where at least two supervisors had rated men, was .73.

Test-retest estimates of reliability were obtained from the supervisors, although they were not obtained from the peers. These correlations, transformed by Fisher’s z, showed an average test-retest correlation ...
Development of content analysis categories. The sheets on which both supervisors and peers had described creative men were formed into seven sets with approximately 12 to 15 sheets in each set. Each of these randomly formed sets was given to a pair of graduate students in psychology. Each member of the pair worked independently, and each read over the reasons and built a classification system, in whatever detail he chose, which represented all the ideas that were contained in the responses. These categories were examined by the author and combined into one master list comprising 55 categories. These categories can be condensed into the following:


Content analysis categories for creativity of men

Interpersonal relations
Job satisfaction
Personal background and general personal characteristics
Problem related behavior
Problem type preferred and/or able to handle
Approach used
Manner of use of approach
Results achieved
Report of results


An identical procedure was followed in forming 
categories for reasons why sets of answers were rated 
as creative, except that in this case the master list 
was put in a form similar to that developed for the 
analysis of reasons why men were judged creative. 

The investigator then classified all the actual rea- 
sons for the ratings of the men given by each of the 
supervisors and peers, and a rescoring of part of 
these data after an interval of one week yielded 
80% agreement as to the category to which it be- 
longed. A similar reliability check on the classifica- 
tion of the reasons why certain answers were judged 
to be creative yielded 82% agreement. 

In forming the classification system the investigator had assumed that good technical knowledge, for instance, characterized the creative person rather than poor technical knowledge, and such assumptions needed to be verified. Consequently, in the content analysis procedures, information was recorded with respect to which end of the continuum was actually attributed to the creative and which to the uncreative person. Nine reversals occurred in judging the creativity of men, and most of these reversals took such forms as, “This uncreative person has a high degree of technical knowledge, but... .” Only one reversal occurred in judging the creativeness or uncreativeness of sets of answers to the engineering problems.


Table 2

Rank Order of First Ten Variables in Reasons Why Men Are Creative

Rank    Frequency
Order   of Mention   Description of Variable
1       30           Is independent of others vs. needs guidance in problem solving.
2       27           Produces novel and unconventional solutions vs. routine ones.
3       23           Produces some solutions or one vs. no solutions.
4       22           Prefers new and difficult problems vs. prefers routine problems.
5       17           Produces practical and valuable solutions vs. impractical solutions.
6       16           Analyzes a situation vs. doesn’t or can’t analyze.
7       16           Shows energy and alertness vs. lacks energy.
8       15           Produces many solutions or ideas vs. few solutions.
9       14           Has a high degree of technical knowledge and academic achievement vs. a low degree.
10      13           Organizes and plans ahead vs. is unorganized and overlooks details.

Note. Combining supervisors and peers, total N was 98.


Results 


For the content analysis data, tests of sig- 
nificance, including chi-square analyses and 
Fisher’s exact probability test (Siegel, 1956), 
were run to determine if there were differ- 
ences between the supervisors and peers in 
the variables stressed. The data on which 
these analyses were made are not presented 
here, but the over-all conclusion was that 
there were no significant differences between 
supervisors and peers in the factors they em- 
phasize when they rate engineers as to their 
creativity. 

A similar procedure in analyzing the dif- 
ferences between supervisors and peers in 
their reasons why answers were judged to be 
creative was carried out. The over-all con- 
clusion for these data was the same as that 
for the data for the creativity of men: 


Outcomes of the 
Tables 2 and 3 present 


answers to engineerj 
off in frequency of 
categories was so s] 
ting point was est: 
tables, 

In addition to the content analysis results, other information relating to the psychological and engineering tests used in the study should be reported next. As previously mentioned, one third of the men in the study were from service areas, one third from research, and one third from project areas. An analysis of variance design revealed the differences among these areas in their performance on the engineering tests to be significant at the .01 level. Individual t tests between each possible pair among service, project, and research areas were also significant at the .01 level. In the interests of anonymity, no individual results on areas will be reported. It is of interest to note that the problems were originally devised by supervisors in only one of the areas. It was expected that this bias might later affect results so that this area would perform the best on these problems. However, the area for which these problems were developed by their supervisors was not the best in its performance on them.

The agreement of ratings of men as creative men with ratings of performance could also be studied. In general, the level of agreement is low. Table 4, among other data, relates peer and supervisor ratings of men to the ratings of performance of the same men on the engineering problems. It shows that out of six possible relationships, only two are significantly related.

The agreement of ratings of performance with each other on the various problems was also obtainable from the data. There were three possible relationships among these three problems, and these are also presented in Table 4. It seems obvious that while there is some relationship among these engineering problems, correlations of the order of .2 indicate that they are not measuring the same thing.


Table 3

Rank Order of First Nine Variables in Reasons Why Answers Are Creative

Rank    Frequency
Order   of Mention   Description of Variable
1       34           Produces comprehensive and generalizable solutions vs. sketchy solutions.
2       27           Presents correct and appropriate solutions vs. incorrect ones.
3       23           Produces many solutions or ideas vs. few solutions.
4       18           Produces novel and unconventional solutions vs. routine ones.
5       18           Produces some solutions or one vs. no solutions.
6       17           Uses flexible and varied approaches vs. inflexible and narrow ones.
7       16           Produces practical and valuable solutions vs. impractical solutions.
8       13           Doesn’t blame others vs. does blame others and antagonizes them.
9       11           Is interested in solution of problems vs. avoids problems.

Note. Combining supervisors and peers, total N was 53.


Table 4

Intercorrelations of the Criterion Measures

                        Performance Criteria              Subjective Criteria
                        Mass Flow   Mech.    Patent       Sup.       Peer
                        Prob.       Prob.    Disc.        Ratings    Ratings
Performance Criteria
  Shop prob.            .18*        .25      .09          .11        .07
  Mass Flow prob.                   .29**    ...          .14        .06
  Mech. prob.                                ...          .29**      .26**
  Patent disc.                                            ...        ...
Subjective Criteria
  Sup. rating                                                        ...
  Peer rating

Note. N varies from 90 to 107. Those correlations involving patent disclosures are point biserial r’s. All the remainder are Pearson product moment correlations.
* Statistically significant at the 5% level.
** Statistically significant at the 1% level.

One other important question relates to the 
predictability of the various criteria used in 
the study. These results are summarized in 
Tables 5 and 6. In general, it seems that 
performance criteria are the most predictable. 

Table 5

Multiple Correlations of Various Combinations of Predictors with Criteria

Engineering Problems as Criteria (a)

Combinations     Multiple R with
of Predictors    Shop Problem    Mass Flow Problem    Mechanization Problem
12345            .48             .33                  .52**
1234             .16             .33*                 .38**
123              .16             .29*                 .36**
12               .16             .26*                 ...

Ratings of Men as Criteria (b)

Combinations     Multiple R with    Multiple R with
of Predictors    Supervisors        Peers
12345678         .37                .40*
1234567          .35                .30*
123456           .34                .30*
12345            .34*               ...
1234             .34*               .30*
123              .31*               .35*
12               .26*               .29*

a Predictors are as follows: 1. originality; 2. idea fluency; 3. adaptive flexibility; 4. associative fluency; and 5. vocabulary.
b Predictors are as follows: 1. mechanization problem; ... vocabulary; 4. associative fluency; ... adaptive flexibility; ... 7. mass flow problem; and 8. originality.
* Statistically significant at the 5% level.
** Statistically significant at the 1% level.


Table 6

Efficiency of Cutting Scores Based on a Discriminant Function in Identifying Men With Patent Disclosures

Of Those Achieving    Patent Disclosure(s)    No Patent Disclosure    Total
a Cut-off Score of    Were Held by            Was Held by             N Was
2300                  7                       0                       7
2000                  13                      11                      24
1700                  18                      25                      43
1400                  22                      43                      65
1100                  23                      64                      87
800                   24                      66                      90


Of the engineering problems developed specifically for this study, the mechanization problem is the most predictable. Four of its multiple R’s were significant at the .01 level, the highest being a multiple R of .52 with a combination of the scores on originality, idea fluency, adaptive flexibility, associative fluency, and vocabulary. However, no cross-validation procedures were carried out on any of the problems, and the practical significance of this result is indicative rather than actual, since it would shrink to limited usefulness

ance criteria are potentially predictable al- 
though there is ag. 


results would sta 
Using an arbitrary cutting score of 2300, 
seven out of the 
have been those wi 
a Cutting score of 


the dichotomized Predictand, 


nt disclosure in 
e R so obtained 


Discussion 


A previous survey of the literature in the 
area of creativity had revealed two factors 
that were predominantly mentioned. The 
first was novelty of ideas. The second was 
the ability to produce valuable and practical 


Thomas B. Sprecher 


solutions. Both of these factors appeared in this study in judging
the creativity of men as well as in judging the creativity of answers.
This lends some additional empirical confirmation of the importance of
these concepts. However, neither of these was at the top of either
list. Also, other kinds of variables appeared. Work-habit variables
were frequently placed in the top 10 in judging the creativity of men.
The most striking of these is the variable mentioned most frequently,
“independence of others.” In the opinion of the peers and supervisors
in this study the creative person was characterized by an ability to
proceed on his own. Other work-habit variables were such
characteristics as a tendency to analyze a situation, [illegible]
energy, and an ability to organize and [illegible] the details of a
project. It is felt that these variables, representing seemingly
workmanlike approaches to a new task, have been [illegible] neglected
or overlooked in previous studies of creativity and should be exploited
in future work, particularly if we are interested in predicting
behavior judged to be creative by engineers.
The results of the content analysis of the reasons why answers are
creative show that the most frequently mentioned variable was that the
answers be comprehensive and generalizable. It should be remembered
that engineers were judging these answers and one might assume that
they have a liking for thorough, complete, and generalizable answers.
These were technical answers to engineering problems judged by
personnel [illegible] in engineering. Novelty of solution was mentioned
by approximately half as many raters as mentioned comprehensiveness of
answers. This reversal of emphasis suggests that creativity in this
setting has a different meaning than that assigned to it in the general
literature. In any case, it is important that the meaning of creativity
be determined for the specific situation where it is to be [illegible].
If these results are not specific to the industrial setting, then
future work in the field of creativity could profit from consideration
of these potentially important variables.
The occurrence of novelty of ideas as a category in judging both the
creativity of answers and in judging the creativity of men


Engineers’ Criteria for Creativity 147 


gives it special importance. However, the variety of ways in which
novelty of ideas, or originality, may be measured is still largely
unexplored. Insofar as the investigator knows, only Guilford (Guilford
et al., 1951, 1952) has done a factor-analytic study of a variety of
ways of measuring originality. Barron (1955) presented some evidence on
the intercorrelations of various measures of originality, but his work
did not demand that the dimensionality of his tests or scoring
procedures be determined.

Since the achievement of practical and valuable answers appears
characteristic of both creative men and creative answers, it would seem
worthwhile in future studies of creativity to include a scoring
procedure which rates answers as well as men on practicality. While it
might be difficult to assign psychological factors which would account
for the production of practical ideas, an applied study might gain
considerably by using such a scoring procedure.

The variable ranked eighth in the list of reasons why answers are
creative presumably appeared because one of the technical problems
involved some aspect of interpersonal relations. Some of the men judged
to be uncreative were judged as such because they were antagonistic in
their attitude towards others.

Tests were of course also used for an intrinsic interest in concurrent
validity. The uninstructed judges used in the study did in general show
the agreement among themselves as to who or what they rated as creative
which is basic to concurrent validity. Several judges, possibly at
least three and this study suggests five, are needed to obtain a fairly
high level of reliability in terms of agreement on any one rating
assigned to an individual. These results hold both for the supervisors
and for peers. Also, these two groups by and large did agree on the
rating they assigned to an individual, and the ratings seem to be
stable, since the supervisors who were requested to rerate these same
men at a later date came up with essentially high test-retest
correlations. Although by the nature of their supervisory
responsibilities they would perhaps be expected to have this high
reliability, it was an important verification.
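
The point about needing three to five judges can be illustrated with the Spearman-Brown formula, which gives the reliability of a pooled rating from the reliability of a single judge. The sketch below is not taken from the study: the single-judge value of .40 is a purely hypothetical figure chosen for illustration, and `spearman_brown` is an illustrative name.

```python
def spearman_brown(single_judge_r: float, n_judges: int) -> float:
    """Reliability of the pooled rating of n_judges comparable judges."""
    if n_judges < 1:
        raise ValueError("n_judges must be at least 1")
    return n_judges * single_judge_r / (1 + (n_judges - 1) * single_judge_r)


# Hypothetical single-judge reliability; the study's own value is not used here.
ASSUMED_SINGLE_JUDGE_R = 0.40

pooled = {k: round(spearman_brown(ASSUMED_SINGLE_JUDGE_R, k), 3) for k in (1, 3, 5)}
print(pooled)
```

Under this assumed starting value, pooling three judges raises the reliability to about .67 and pooling five to about .77, which shows why several judges are needed before a single assigned rating can be trusted.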


The caliber of the personnel involved in 
different work areas within an organization 
may have an important bearing on personnel 
assignment or selection, if the results from 
the engineering performance tests have gener- 
ality. Clear and consistent differences be- 
tween work areas appeared on these problems 
in spite of the bias in favor of one of the 
areas due to their development by a particu- 
lar supervisory group. 

One other major conclusion emphasizes the extent of differences between
performance ratings and the more subjective over-all evaluations of
men. There seemed to be little agreement between such measures, since
the relationships, while significant in some instances, were low (see
Table 4). In part this may have been due to the low interrelationships
among the engineering problems themselves, as Table 4 also suggests
that a definitely larger number, possibly of the order of 10, is
necessary to provide a minimum sampling of the variety of technical
tasks possible.

Importance is attached to the fact that the objective performance
measures seemed to be the more predictable. Both patent disclosures and
the engineering problems came closer to practical predictability than
the more subjective measures used. However, it is likely that the
ratings of the men as men could be better predicted by including work
habit measures, since work habits, at least insofar as the verbalized
reasons for the ratings were concerned, were related to the judgments.
If this result holds up in further studies, important practical gains
could be made in predicting such a criterion by the use of such
variables.

Summary 


ranked lowest. 
rankings on creativity 
open-end engineering problems. No signifi- 
cant differences in the bases for such judg- 
ments were noted between engineers and 
supervisors of engineers. 

The content analysis results verified a widespread impression in the
literature that the novelty and worth of ideas are important factors in
creativity. It also brought out other factors which had been largely
overlooked in the literature, such as independence in problem solving
and the achievement of comprehensive answers.

The various criteria used indicate a fair 
level of agreement in these untrained raters 
as to which products or which people are 
creative, although there may be basic differ- 
ences in the meaning of the word creative 
according to its use as an over-all rating of 
a person or whether it is a rating of specific 
engineering performance. There also seems 
to be more practical predictability in the per- 
formance ratings than in the more subjective 
over-all ratings, granting two qualifications. 
First, since the ratings of the total impression 
of an engineer were affected by work-habit 
characteristics, inclusion of these variables as 




predictors might increase the predictability of 
such a criterion. Second, if performance rat- 
ings of engineering products are used, a fairly 
large sample of the various kinds of problems 
is required. 


Received July 30, 1958.


References 


Barron, F. The disposition toward originality. J. abnorm. soc.
Psychol., 1955, 51, 478-485.

Guilford, J. P., Wilson, R. C., Christensen, P. R., & Lewis, D. J. A
factor-analytic study of creative thinking: I. Hypotheses and
description of tests. Los Angeles: Univer. of Southern California,
1951. Psychol. Lab. Rep., No. 3.

Guilford, J. P., Wilson, R. C., & Christensen, P. R. A factor-analytic
study of creative thinking: II. Administration of tests and analysis of
results. Los Angeles: Univer. of Southern California, 1952. Psychol.
Lab. Rep., No. 8.

Siegel, S. Nonparametric statistics for the behavioral sciences. New
York: McGraw-Hill, 1956.


Journal of Applied Psychology 


Vol. 43, No. 3


June, 1959 


THE EFFECTS OF TOP AND MIDDLE MANAGEMENT 
SETS ON THE GHISELLI SELF-DESCRIPTION 
INVENTORY 


ROGER A. KAUFMAN, KARL L. HAKMILLER,¹ AND LYMAN W. PORTER


University of California 


A recent article by Porter and Ghiselli (1957) explored the differences
in self-perceptions between top management and middle management
personnel employed in a wide variety of industries. The instrument used
to gather the data was a self-description inventory (SDI) developed by
Ghiselli (1954); 21 of the 64 forced-choice items on the inventory
differentiated between the two levels of management. Top management
individuals, when compared to those of middle management, [illegible]
themselves as more active, self-[illegible], and enterprising. The 21
items that differentiated between the two groups were assigned scale
values and used by Ghiselli and Lodahl (1958) in a small-group
experiment. The scale derived from these items was labeled a
Decision-Making Approach (DMA) scale, since the items seemed to
describe primarily differences in the ways the two management groups
attack problems.

The raw data for the DMA scale of the SDI were based on the
self-perception descriptions given by top and middle management
individuals in situations in which they had no knowledge of the
ultimate use of their answers to the items on the inventory; in other
words, none of these management individuals knew that they were
contributing data to a study of differences between top and middle
management personnel. Therefore, although some individuals were
undoubtedly trying to make themselves look as favorable as possible on
the scale, or were operating under some specific self-induced “set,”
there was no general “top management” set operating for all of the
highest level personnel, or “middle management” set for individuals in
the lower level positions.

¹ Now at the University of Minnesota.

The over-all purpose of the present study 
was to determine the effect of specific “top” 
and “middle” management sets on the DMA 
scale of the SDI and on the other scales 
developed by Ghiselli in connection with the 
SDI. Ideally, the Ss for such a study should 
be people holding middle and top manage- 
ment positions in business and industry. Be- 
cause of the difficulties in gathering data from 
a large enough sample of management per- 
sonnel at those levels, it was decided to use 
a more available group—college males. Since 
the primary purpose of this study was 
not to ascertain the differences in opinions 
between top and middle management indi- 
viduals, but rather to determine the effect of 
top and middle management sets on the SDI, 
it was felt that individuals who were reason- 
ably familiar with the business world either 
through experience, education, or business 
contacts would be able to respond appropri- 
ately to the specific sets. 

The following specific effects of top and middle management sets were
investigated: (a) a change in level of scores on the DMA scale of the
SDI, when Ss are operating under a top management set in contrast with
operating under a middle management set; (b) the degree and direction
of the correlation between top and middle management sets on the DMA
scale; (c) the spread of influence of top and middle management sets to
the other five scales in addition to the DMA which have been developed
for use with the SDI; these other scales include ones for Initiative,
Intelligence, Occupational Level, Self-Assurance and Supervisory
Qualities.

The above paragraph presents the primary 
effects that were investigated. One additional, 
but secondary, effect was also studied. A 
variation in set was introduced for each man- 
agement level in order to compare descrip- 
tions when Ss tried to place themselves in 
the roles, with descriptions when they tried 
to picture how others actually fill the roles. 
Thus, one top management set was couched 
in terms “if you were a top management 
man,” and another top management set used 
terms that in effect asked the S “how would 
a top management man” (fill out the inven- 
tory). Identical variations were used for 
the middle management level. 


Method 
Subjects 


The Ss were 44 male undergraduates en- 
rolled in psychology courses, chiefly upper 
division classes in industrial psychology. 


Procedure 


The SDI was administered to the Ss by presenting a booklet containing a
general instruction sheet and specific instruction sheets to be used
with the SDI when filled out under each of five “sets.” The first page
of the booklet told the S: “In this booklet you will find several
copies of a Self Description Inventory for you to fill out. The sheet
before each copy of the inventory has instructions for you to follow in
filling out that copy. Execute each inventory in the order in which
they are arranged. Be sure to follow the instructions on the sheets in
front of each inventory.”

The instructions for each of the five sets used in 
the study were as follows (the labels being omitted 
on the instruction sheets) : 

Self set: “The purpose of this inventory is to obtain a picture of the
traits you believe you possess and to see how you describe yourself.
There are no right or wrong answers, so try to describe yourself as
accurately and honestly as you can.”

“Would Top” set: “Fill out this next inventory as you think a top
management man would fill it out.”

“If Top” set: “Fill out this next inventory as you would if you were a
top management man.”

“Would Middle” set: “Fill out this next inventory as you think a middle
management man would fill it out.”

“If Middle” set: “Fill out this next inventory as you would if you were
a middle management man.”

The Self set was always administered first, in order 
to avoid any possible contamination from the other 
sets. The order of the four management sets was 
counterbalanced to control for order effects among 
these sets. No definition of top or of middle man- 
agement was given to the Ss, each S being left free 
to make his own interpretations of the terms. This 
was done in order to avoid over-attention by the 
Ss to a particular definition of the terms; it was 
felt that the terms were sufficiently well known to 
all Ss. 
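
The ordering rule described above (Self set first, the four management sets counterbalanced) can be sketched as follows. The article does not say which counterbalancing scheme was used, so cycling through all 24 possible orders is an assumption made here for illustration, and `booklet_orders` is a hypothetical helper name.

```python
from itertools import permutations

MANAGEMENT_SETS = ("Would Top", "If Top", "Would Middle", "If Middle")


def booklet_orders(n_subjects):
    """Assign each subject a booklet order: Self first, then one permutation.

    Assumption: subjects are cycled through all 24 permutations in turn.
    """
    orders = list(permutations(MANAGEMENT_SETS))  # 4! = 24 possible orders
    return [("Self",) + orders[i % len(orders)] for i in range(n_subjects)]


booklets = booklet_orders(44)
print(booklets[0])
print(len(set(booklets)), "distinct orders across", len(booklets), "subjects")
```

Because 44 is not a multiple of 24, a scheme of this kind balances the orders only approximately; a Latin-square arrangement of four orders (11 Ss per order) would be another reasonable reading of "counterbalanced."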


Each inventory filled out under each of the five sets was scored for
the six scales developed from the SDI: Decision-Making Approach,
Supervisory Qualities, Occupational Level, Intelligence, Initiative,
and Self-Assurance.


Results and Discussion 


Table 1 presents the means and standard deviations for each scale on
each of the five sets. Since each S completed the SDI under all five
sets, and each SDI was scored on all six scales, the means in Table 1
are based on an N of 44. Table 1 also indicates the t-test results for
differences between the means of the Self set and each of the
management sets on each scale.

Table 2 presents the results of analyses of variance performed on each
scale, using the data from the four management sets. Table 3 shows the
correlations among the Self, If Top, and If Middle sets for each of the
six scales.

Tables 1 and 2 present the necessary data for evaluating the first
effect being investigated in this study. The means for the four
management sets presented in Table 1 and the analysis of variance for
the DMA scale using these means presented in Table 2 demonstrate that
the two top management sets produced significantly higher mean scores
on the DMA scale compared with the two middle management sets. Thus, it
appears that individuals operating under a top management set at the
time of taking the SDI can obtain significantly higher scores than if
they are operating under a middle management set. Also, the tests of
significance given in Table 1 indicate that for these Ss, both of the
top and one of the middle management sets produced significant
increases on DMA scores in comparison with a self set.
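
Because every S took the inventory under every set, a comparison of the Self mean with a management-set mean is a comparison of correlated means. The article does not state which t formula was used, so the following is only a sketch of one standard approach, assuming the printed standard deviations can be combined with the Self/“If Top” correlation from Table 3; `correlated_t` is an illustrative name.

```python
import math


def correlated_t(mean_a, sd_a, mean_b, sd_b, r_ab, n):
    """t for the difference between two correlated means from summary statistics.

    Assumption: the variance of the difference scores is rebuilt from the two
    standard deviations and their correlation, and divided by n.
    """
    var_diff = sd_a ** 2 + sd_b ** 2 - 2 * r_ab * sd_a * sd_b
    standard_error = math.sqrt(var_diff / n)
    return (mean_b - mean_a) / standard_error


# DMA scale: Self (17.61, 4.49) vs. "If Top" (21.41, 5.00), r = .413, N = 44.
t_dma = correlated_t(17.61, 4.49, 21.41, 5.00, 0.413, 44)
print(f"t = {t_dma:.2f} with {44 - 1} df")
```

A t of this size with 43 degrees of freedom is well beyond the .01 level, which agrees with the double asterisk printed for that cell of Table 1.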


The second effect that was evaluated was whether there was a
significant correlation between top and middle management sets on the
DMA scale. Since the analysis of variance performed on the management
sets for the DMA scale in Table 2 showed no significant differences
between the “If” and “Would” sets, the If Top and If Middle sets were
used to represent the top and middle management sets for the
correlations shown in Table 3. Table 3 shows that there is a
significant positive correlation of .501 between individuals’ scores on
the DMA scale when the inventory is taken under a top set and the DMA
scores for these individuals when operating under a middle management
set. This positive correlation indicates that an individual’s position
on the DMA relative to the positions of other individuals tends not to
be affected by the particular set operating for all individuals. That
is, those individuals who scored high on the DMA under a top management
set also tended to score high on the DMA under a middle management set,
and likewise those who scored relatively low under one set were also
low under the other set.

Table 1 and Table 2 provide data to test whether the top and middle
management sets affect other scales in addition to the DMA scale. Krug
(1958) has recently demonstrated that if Ss are given specific sets in
accordance with some of the SDI scales, the effect of a set will be to
raise the score on that scale and to generalize to some of the other
scales in the inventory. As can be seen by the means in Table 1 and the
analyses of variance in Table 2, Ss scored significantly higher on all
of the other scales when filling out the SDI with top management sets.
Also, Table 1 shows that on three of the six scales (Initiative,
Occupational Level, and Self-Assurance) all management sets caused a
significant increase in scores compared to the Self set. On the DMA
scale three of the four sets produced significant increases; only on
the Intelligence and Supervisory scales did a majority of the sets fail
to increase the scores significantly. In only 4 of 24 instances did the
Self set produce higher means, and none of these was significantly
higher. These findings, in general, confirm those of Krug regarding the
spread of a particular set’s influence. In the present study, a top
management set produces greater spread than does a middle management
set.

Table 1

Means and Standard Deviations

                                                    Sets
SDI Scales                     Self     “Would     “If       “Would     “If
                                        Top”       Top”      Middle”    Middle”

DMA                    Mean    17.61    20.80**    21.41**   19.00      19.75*
                       SD       4.49     4.75       5.00      4.70       4.11
Initiative             Mean    27.89    36.09**    35.98**   32.85**    35.14**
                       SD       5.44     6.62       5.96      7.14       6.56
Intelligence           Mean    41.57    43.84      44.36*    39.73      39.98
                       SD       7.88     7.76       6.20      7.33       7.38
Occupational Level     Mean    35.41    44.50**    44.75**   39.84**    41.36**
                       SD       8.45     8.08       7.72      8.32       9.51
Self-Assurance         Mean    26.50    31.84**    33.32**   29.82**    29.59**
                       SD       5.22     5.08       5.02      5.64       4.92
Supervisory Qualities  Mean    28.16    30.39      30.93*    26.59      26.64
                       SD       7.49     6.47       5.97      7.25       5.66

* Significant from Self mean at .05 level.
** Significant from Self mean at .01 level.

Table 2

Analyses of Variance of Four Management Sets on Each SDI Scale

                                                                           Occupational    Self-           Supervisory
                           DMA             Initiative      Intelligence    Level           Assurance       Qualities
Source               df    MS       F      MS      F       MS      F       MS      F       MS      F       MS      F

Management level       1   131.28   8.95*  182.09  5.03*   794.75  24.11*  712.02  16.09*  369.57  23.36*  719.63  26.86*
“If” vs. “Would”       1    20.46   1.39    56.31  1.42      6.56  -        34.57  -        15.96   1.01     3.38  -
Subjects              43   [illeg.] 2.65*   64.83  1.79*   107.85   3.27*  151.82   3.43*   59.13   3.74*   81.86   3.06*
Level x “If”/“Would”   1     0.18   -       62.62  1.73      0.82  -        17.82  -        30.17   1.91     3.21  -
Residual             129    14.67           36.18           32.96           44.24           15.82           26.79

* Significant at .01 level.

Table 3

Correlations Between Sets Within Scales

                         Self/       Self/          “If Top”/
Scale                    “If Top”    “If Middle”    “If Middle”

DMA                      .413**      .273           .501**
Initiative               .359*       .333*          .173
Intelligence             .484**      .418**         [illegible]
Occupational Level       .436**      .377*          [illegible]
Self-Assurance           .461**      .441**         .629**
Supervisory Qualities    .399**      .294*          .549**

* Significant at <.05.
** Significant at <.01.
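
The F ratios for the top-versus-middle comparison in Table 2 can be checked against the printed mean squares, since each F is the effect mean square divided by the residual mean square. The sketch below uses the values as they can be read from the scanned table; the dictionary names are illustrative only.

```python
# Mean squares and F ratios as read from Table 2 (management-level row).
LEVEL_MS = {
    "DMA": 131.28,
    "Initiative": 182.09,
    "Intelligence": 794.75,
    "Occupational Level": 712.02,
    "Self-Assurance": 369.57,
    "Supervisory Qualities": 719.63,
}
RESIDUAL_MS = {
    "DMA": 14.67,
    "Initiative": 36.18,
    "Intelligence": 32.96,
    "Occupational Level": 44.24,
    "Self-Assurance": 15.82,
    "Supervisory Qualities": 26.79,
}
PRINTED_F = {
    "DMA": 8.95,
    "Initiative": 5.03,
    "Intelligence": 24.11,
    "Occupational Level": 16.09,
    "Self-Assurance": 23.36,
    "Supervisory Qualities": 26.86,
}

# F = effect mean square / residual mean square, with 1 and 129 df.
f_ratios = {scale: LEVEL_MS[scale] / RESIDUAL_MS[scale] for scale in LEVEL_MS}

for scale, f in f_ratios.items():
    print(f"{scale}: computed F = {f:.2f}, printed F = {PRINTED_F[scale]:.2f}")
```

Agreement between the computed and printed ratios is a useful check that the table has been read correctly from the scan.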

As previously mentioned, the data presented in Table 2 show that the Ss
in this study did not give different results between sets that
attempted to get at how they would picture themselves in the management
positions (the “If” sets), and how they perceive others in these
positions (the “Would” sets). The particular college males contained in
our sample apparently feel they would fulfill the roles about as they
perceive others fulfilling the roles.

Taken as a whole, the results of this experiment indicate that a top
management set can raise scores on all scales of the SDI when compared
with a middle management set or a self set. However, the positive
correlations between scores obtained under different sets also show
that a top or middle management set does not greatly alter the relative
positions of individuals on the DMA or other scales of the SDI (except,
possibly, the Initiative scale). This indicates, for example, that if
an individual assumes a top management set while others assume a self
set, he can raise his scores on the various scales relative to the
others’ scores on the scales; if, however, the others used a top set
instead of a self set, the individual’s relative position would not be
greatly different from what it would have been if everyone had used a
self set. The implications of these results for organizations using the
SDI as a selection device would seem to be that for the types of sets
used in this study the predictions for a given individual will not be
strongly affected by the particular set he uses as long as others do
not use entirely different sets. In situations where predictions are
desired for specific individuals (as contrasted to those situations
where composite descriptions of large groups are desired), it would
seem to be important to try to induce a uniform set for all
individuals. Any uniformity that could be produced should be at least
as important as the nature of the particular set.
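
The significance of a between-set correlation such as the .501 reported for the DMA scale can be examined with the usual t transformation of r. This is a sketch of that standard formula, not a reconstruction of the authors' computations; `t_for_r` is an illustrative name.

```python
import math


def t_for_r(r: float, n: int) -> float:
    """t statistic for testing a Pearson r against zero, with n - 2 df."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r * r)


# DMA scale, "If Top" vs. "If Middle", N = 44 Ss.
t_dma = t_for_r(0.501, 44)
print(f"t = {t_dma:.2f} with {44 - 2} df")
```

With 42 degrees of freedom a t of about 3.75 exceeds the conventional .01 critical value, consistent with the statement that relative standing on the DMA is largely preserved across sets.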


iy 


Summary 


Forty-four male undergraduates, chiefly from courses in industrial
psychology, were given the Ghiselli Self-Description Inventory to
execute under five different “sets”: a Self set, two top management
sets, and two middle management sets. Each inventory given under each
set was scored on the six scales of the SDI; emphasis, however, was
given to the results for the Decision-Making-Approach scale which had
been derived previously from self descriptions of top and middle
management personnel. The major results of the present study were:

1. Top management sets produced significantly higher mean scores on the
DMA scale than did middle management sets.

2. There was a significant positive correlation between top and middle
management sets on the DMA scale.

3. Subjects tended to score significantly higher on the other SDI
scales when operating under top management sets.


Received August 25, 1958. 


References 


Ghiselli, E. E. The forced-choice technique in self- 
description. Personnel Psychol., 1954, 7, 201-208. 

Ghiselli, E. E., & Lodahl, T. M. Patterns of mana- 
gerial traits and group effectiveness. J. abnorm. 
soc. Psychol., 1958, 57, 61-66. 

Krug, R. E. The effect of specific selection sets on a 
forced-choice self-description inventory. J. appl. 
Psychol., 1958, 42, 89-92. 

Porter, L. W., & Ghiselli, E. E. The self perceptions of top and middle
management personnel. Personnel Psychol., 1957, 10, 397-406.


Journal of Applied Psychology 
Vol. 43, No. 3, 1959 


EFFECTS OF MASSED PRACTICE AND THICKNESS 
OF HANDCOVERINGS ON MANIPULATION 
WITH GLOVES 


HILDE GROTH AND JOHN LYMAN


University of California, Los Angeles 


Prior work in this laboratory has indicated 
a consistent relationship between the amount 
of prehension force applied to a manipulated 
object and the coefficient of friction between 
the object and the handcovering (Groth & Ly- 
man, 1958; Zweizig & Lyman, 1957). How- 
ever, results of these experiments did not de- 
lineate performance changes which might be 
induced by thickness of handcovering mate- 
rials or by massed practice. The present ex- 
periment accordingly was carried out to ex- 
amine the specific effects of these variables. 

A search of the literature provided only 
scanty information concerning the effects of 
practice upon gloved performance. Teichner 
and collaborators (Teichner, Kobrick, & 
Dusek, 1954) reported improvement of speed 
on the Minnesota Rate of Manipulation Test 
for three types of gloves without finding such 
a trend for the bare hand. Their results seem 
“. . . to support the notion that with suffi- 
cient practice difficult glove tasks become con- 
siderably reduced in difficulty to the point 
where relative impairment might possibly be 
very small” (Teichner et al., p. 24). Evidence for improvement of skill
on two other manipulatory tasks while wearing gloves has also been
found [illegible] (1947). [illegible] over-all learning with gloves,
[illegible].

Since our previous investigations have indicated a fair relationship
between task stress and prehension force levels we expected to find a
gradual increase in applied force from start to finish of a massed
trial (Groth, 1957). However, studies by other investigators (Hovland,
1951; Telford & Swenson, 1942) of variations of tension in the muscles
[illegible] to a task have shown little agreement as to predictability
of tension increases or decreases attributable to practice during motor
[illegible] learning. Therefore, our primary interest in this
investigation was directed toward evaluation of performance changes
during the course of a prolonged trial as well as changes in absolute
level when comparing over-all performance on short trials with those of
[illegible] duration. Simultaneously, we explored the effects of
thickness of selected handcovering materials on three criterion
measures of manipulatory skill.

¹ This investigation was supported by QM Contract No. DA 44-109-9M-1531
between the U. S. Army QM Corps and the University of California, Los
Angeles. The opinions expressed are those of the authors and do not
necessarily reflect those of the contracting agency.

Procedure

Subjects. Twenty-four male undergraduate engineering students served as
experimental subjects.

Apparatus and task. The electronically-controlled display and control
matrix and the cylindrical prehension force transducer have been
described in detail elsewhere (Groth, 1957). In a self-paced task, the
Ss were asked to place the cylinder into a recess in the control board
which corresponded in location to a light in the display matrix.
Cylinder placement completed an electrical contact which switched the
display light to another location. This light sequence appeared random
to the Ss.

The following measurements were recorded at three-minute intervals:
(a) the time integral of prehension force, (b) the sum of the transport
times, (c) the sum of the cylinder transports.

Changes in surface friction and thickness of glove material were
controlled by the choice of handcoverings with the following
characteristics:

bare hands, μ = .73
knit cotton gloves, μ = .25, thickness 0.25 mm
army leather gloves, μ = .55, thickness 0.75 mm
army arctic mittens with fleece liner, μ = .55, thickness 3.50 mm

The coefficients of surface friction between the handcovering and
aluminum were determined by the drag method described previously (Groth
& Lyman, 1958). Thickness of the uncompressed gloves was measured with
calipers.

Contact area during manipulation was controlled by instructions for
grasping the cylinder.

Routine. The experiment was conducted during October, 1957. Room
temperatures ranged from 25° C to 30° C. The Ss were assigned randomly
to four experimental groups, each S performing with one type of
handcovering only. After familiarization, the task was administered to
both the right and the left hands while in a standing posture. The
sequence of right-left administration was counterbalanced within each
experimental group. Each S came for one experimental session,
performing 30 min. with each hand. The two runs were separated by a
3 min. rest pause during which the Ss were required to sit down.

Subjects performing with the bare hand dried their hands thoroughly
with a turkish towel before the run. Perspiration and any consequent
changes in surface friction could not be controlled during bare hand
performance.

Calculations. 1. The effects of massed practice on prehension force,
number of transports, and time per transport were assessed by analysis
of variance. Comparisons were made for the readings taken at 6, 12, 18,
24, and 30 min. of performance.

2. The effects of handcoverings on prehension force, number of
transports, and time per transport were assessed by analysis of
variance. Comparisons were made among the mean values for the full
30-min. period.

Differences between individual treatments were evaluated by Duncan’s
(1955) Multiple Range Test when appropriate. The significance level was
set at P < .05.

Table 1

Summary of F-values of Analyses of Variance for Practice Effects

                   Prehension Force     Number of Transports    Time per Transport
Handcovering       Right F    Left F    Right F    Left F       Right F     Left F

Arctic Mitten       .04        .04       .11        .10         [illeg.]     .14
Cotton Glove        .30       1.42       .13        .08          .07        1.43
Leather Glove      3.14*       .56      [illeg.]    .26          .74         .63
Bare Hand           .39        .13       .30       1.36         1.03         .83

* P < .05.


Results ²

Effects of practice on gloved performance. Criterion measures: (a) Mean
prehension force (PF), obtained by dividing the integral of force by
total transport time. (b) Total number of transports during the
30-minute test trials. (c) Time per transport, obtained by dividing
total transport time by total number of transports.

Fig. 1. Mean performance and regression lines for right-hand
performance. [Figure not reproduced. Recoverable labels: coefficient of
friction; mean number of transports; curves for bare hand, cotton
glove, leather glove, and arctic mitten.]

² [illegible] American Documentation Institute. Order Document No. 5886
from ADI Auxiliary Publications Project, Photoduplication Service,
Library of Congress, Washington 28, D. C., remitting in advance
$[illegible].75 for microfilm or $2.50 for photocopies. Make checks
payable to Chief, Photoduplication Service, Library of Congress.
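
The three criterion measures defined in the Results section follow directly from the quantities recorded at three-minute intervals. The sketch below shows the arithmetic only; the class name, field names, units, and the ten identical readings are hypothetical values invented for illustration and are not data from the experiment.

```python
from dataclasses import dataclass


@dataclass
class IntervalReading:
    """One three-minute reading (field names and units are assumptions)."""
    force_time_integral: float  # time integral of prehension force, gm-sec
    transport_time: float       # sum of the transport times, sec
    transports: int             # number of cylinder transports


def criterion_measures(readings):
    """Compute the three criterion measures for one 30-minute run."""
    total_integral = sum(r.force_time_integral for r in readings)
    total_time = sum(r.transport_time for r in readings)
    total_transports = sum(r.transports for r in readings)
    return {
        "mean_prehension_force": total_integral / total_time,  # gms
        "total_transports": total_transports,
        "time_per_transport": total_time / total_transports,   # sec
    }


# Ten hypothetical three-minute readings making up one 30-minute run.
readings = [IntervalReading(180000.0, 120.0, 150) for _ in range(10)]
measures = criterion_measures(readings)
print(measures)
```

Dividing the force integral by transport time rather than by elapsed time means that mean prehension force reflects only the periods in which the cylinder was actually being held.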




Table 2


Mean Prehension Force and Variability


                 Thickness  Coeff. of         Right Hand               Left Hand
Conditions         (mm.)    Friction (μ)   X (gms.)  s (gms.)    X (gms.)  s (gms.)
Bare Hand            —         0.73          1102      196         1214    [illegible]
Cotton Glove        0.25       0.25          2674      829         2140    [illegible]
Leather Glove       0.75       0.55          1514      216         2018    [illegible]
Arctic Mitten       3.50       0.55          3354      385         2535    [illegible]


The statistical analyses of performance
changes as a function of time failed to reach
significance with one exception, PF for right-
hand performance with the leather glove.
Table 1 reports the summary of the F values
of the analyses of variance. However, in-
spection of the plotted data indicated that
these fell into two distinct groups for all
handcovering conditions: “poor” performance
and “good” performance. For prehension
force “poor” performance was obtained with
arctic mittens and cotton gloves, and “good”
performance with bare hands and leather
gloves. The diametrically reversed conditions
were found for number of transports and time
per transport. Regression lines were deter-
mined graphically for these two performance
groups (Askovitz, 1955a, 1955b). These are
shown for the right hand in Fig. 1.

Little change in prehension force during the
trial can be detected from the regression lines.
Furthermore, practice seems to exert opposite
effects upon the two groups.

Regression lines for the number of trans-
ports and the time per transport show a slight
but consistent trend of performance facilita-
tion from the beginning to the end of the prac-
tice session for both groups.

Effects of handcoverings on performance.
Criterion measures: (a) mean prehension
force, (b) total number of transports, and (c)
time per transport. The grand means


Table 3


Mean Time per Transport and Variability


                 Thickness  Coeff. of         Right Hand               Left Hand
Conditions         (mm.)    Friction (μ)   X (sec.)  s (sec.)    X (sec.)     s (sec.)
Bare Hand            —         0.73          0.84      0.26        0.89       [illegible]
Cotton Glove        0.25       0.25          0.67      0.39      [illegible]    0.27
Leather Glove       0.75       0.55          0.75      0.15        0.77       [illegible]
Arctic Mitten       3.50       0.55          0.55      0.48        0.55         0.41

Table 4


Mean Number of Transports and Variability


                 Thickness  Coeff. of       Right Hand        Left Hand
Conditions         (mm.)    Friction (μ)     X       s         X       s
Bare Hand            —         0.73         121      19       105      20
Cotton Glove        0.25       0.25         139      22       137      12
Leather Glove       0.75       0.55         117      11       115      19
Arctic Mitten       3.50       0.55         138      18       145   [illegible]


Manipulation with Gloves 157 


Table 5


Analysis of Variance for Prehension Force


Source of                 Sum of         Mean
Variance         df       Squares        Squares         F
Right Hand
  Treatments      3      19,365,000     6,455,000       4.82*
  Within         20      26,802,000     1,340,000
  Total          23      46,167,000
Left Hand
  Treatments      3       5,532,000     [illegible]   [illegible]
  Within         20      12,318,000     [illegible]
  Total          23      17,851,000

* P < .05.


and respective variabilities were taken as the
basis for the statistical analyses in order
to obtain greater stability by reducing the
variability. These data are reported in Tables
2, 3, and 4.

Statistical significance was not obtained for
all analyses but the graphical representation
indicates a consistent relationship of the hand-
coverings to the coefficient of friction. These
analyses are reported in Tables 5, 6, 7, 8, 9.
Number of transports and time per transport
show some performance facilitation with a de-
crease in the coefficient of friction whereas
prehension force shows an impairment of per-
formance with such a decrease. Performance
with the fleece-lined arctic mittens corre-
sponds to performance of a thin handcover-
ing with a very low coefficient of friction for
all three criterion measures. Figures 2 and 3
present these results graphically.
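As a check on how the analysis of variance tables are read, the F ratio for right-hand prehension force can be recomputed from the sums of squares and degrees of freedom printed in Table 5. This is a minimal sketch; the helper name `f_ratio` is hypothetical, and the table prints the within mean square rounded to the nearest ten thousand.

```python
def f_ratio(ss_between, df_between, ss_within, df_within):
    """Return (mean square between, mean square within, F ratio)."""
    ms_between = ss_between / df_between
    ms_within = ss_within / df_within
    return ms_between, ms_within, ms_between / ms_within


# Right-hand prehension force: sums of squares and df as printed in Table 5.
ms_treatments, ms_within, f_value = f_ratio(19_365_000, 3, 26_802_000, 20)
print(round(ms_treatments), round(ms_within), round(f_value, 2))
```

The recomputed F agrees with the tabled value of 4.82 for 3 and 20 degrees of freedom.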


Discussion

The effects of massed practice obtained in
this study fall into two categories: those per-
taining to performance changes during the 30
minutes of performance and changes of over-
all performance level when compared to stud-
ies of shorter trial duration.

Prehension force showed no evidence of a
change during the 30 minutes. These results
appear to be consistent with Telford and
Swenson’s (1942) later trials during mirror
tracing practice. Perceptually, our task was
much simpler than mirror tracing, which
might account for the absence of an increase
in muscular tension toward the end of the
trial.

Performance speed and total output showed 
a trend indicating slight performance facilita- 
tion with practice. However, this trend was 
obtained for gloved manipulation as well as 
for the bare hand condition. Since prior work 
with the same apparatus had never shown any 
practice effect for bare hands when trials were 
separated by several days, we felt reasonably 
assured that no such changes would occur 
during the course of a single prolonged trial 
(Groth, 1957). Our results show, however, 
that we encountered a similar problem to that 


Table 6


Duncan’s Multiple Range Test for Difference in
PF Between Any Two Treatments for
Right-Hand Performance


                                   Shortest
                                   Significant
                      Compari-     Ranges         Obtained
Conditions            sons         P = 5%         Ranges
Bare Hand       A     D-A           1049           2252*
Leather Glove   B     D-B           1021           1840*
Cotton Glove    C     D-C            971            680
Arctic Mitten   D     C-A           1021           1572*
                      C-B            971           1160*
                      B-A            971            412

* P < .05.




Table 7


Analysis of Variance for Number of Transports


Source of                Sum of       Mean
Variance         df      Squares      Squares       F
Right Hand
  Treatments      3       2,800         930        1.07
  Within         20      17,350         870
  Total          23      20,150
Left Hand
  Treatments      3       6,380       2,130        2.88
  Within         20      14,720         740
  Total          23      21,100


of Blair and Gottschalk in their study and we
cannot, therefore, attribute a certain amount
of performance facilitation to “learning how
to perform with gloves.”

When comparing the results for the over- 
all trial length with such scores taken previ- 
ously on trials of shorter duration, we found 
several interesting changes (Groth & Lyman, 
1958). Prehension force as a function of sur- 
face friction showed the same trend as before 
but a considerable increase in absolute force 
level was recorded (Fig. 2). An explanatory 
hypothesis for this finding probably should be 
sought in terms of an “Einstellungseffekt” re- 
sponsible for raising the general tension level 


because of knowledge of the prolonged trial 
period. 


Speed and total work output both increased
with a decrease in surface friction in this
study. A possible post hoc explanation might
be suggested in terms of “overcompensation.”
The S might try to offset these nonoptimal
conditions by “working faster.” However,
both of these hypotheses need empirical vali-
dation. Our prior investigation failed to show
any predictable relationship between speed
and surface friction and indicated a
decrease in total output for conditions of ex-
tremely low surface friction associated with
a thin coating of a silicone grease on the sur-
face of the fingers and gloves.

Thickness of handcovering material became
important for the extreme condition, namely
the arctic mittens. In terms of changes in


Table 8


Analysis of Variance for Time per Transport


Source of                Sum of      Mean
Variance         df      Squares     Squares      F
Right Hand
  Treatments      3       .273        .091       2.76
  Within         20       .657        .033
  Total          23       .930
Left Hand
  Treatments      3       .376        .125       4.46*
  Within         20       .563        .028
  Total          23       .939

* P < .05.




quality of performance, the mittens were 
equivalent to a thin handcovering with a very 
low coefficient of friction. This held true for 
all three measures. 

The results of this investigation supported 
our earlier findings demonstrating the impor- 
tance of characteristics of surface friction and 
bulkiness of material for the design of pro- 
tective handcoverings. We did not find any 
evidence for the assumption that effort can be 
reduced by practice on the type of manipula- 
tion task we used. We would like to em- 
phasize the relatively low skill level required 
by our task, however, and point out that this 
may have been a large factor in our results. 

This problem of task specificity has been 
brought out very clearly in a report by Brad- 
ley (1957) who investigated certain glove 
characteristics on control manipulability. He 
found a considerable interaction of perform- 
ance with a large variety of gloves and the 
type of control operation required. 


Table 9


Duncan’s Multiple Range Test for Difference in Time
per Transport Between Any Two Treatments
for Left-Hand Performance


                                   Shortest
                                   Significant
                      Compari-     Ranges         Obtained
Conditions            sons         P = 5%         Ranges
Bare Hand       A     A-D           .318           .34*
Leather Glove   B     A-C           .310           .24
Cotton Glove    C     A-B           .295           .12
Arctic Mitten   D     B-D           .310           .22
                      B-C           .295           .12
                      C-D           .295           .10

* P < .05.

Summary 


This study was designed to evaluate the im-
portance of surface friction and thickness of
handcovering materials during prolonged ma-
nipulatory performance.

Surface friction and thickness of material
were controlled by the following experimental
conditions:

bare hands, μ = .73, thickness = 0

knit cotton gloves, μ = .25, thickness =
0.25 mm.

army leather gloves, μ = .55, thickness =
0.75 mm.

army arctic mittens, fleece lined, μ = .55,
thickness = 3.50 mm.

Manipulatory skill was evaluated by three
criterion measures: mean prehension force,
total number of transports, and mean time
per transport. The measures were taken at
three-minute intervals.

Twenty-four male Ss performed a simple
manipulation task of 30 minutes’ duration.
The Ss were randomly divided into four
groups of six Ss each. Each group performed
with one type of handcovering only.

Analysis of the results failed to show a time
trend for prehension force, but regression lines
for the number of transports and for time per
transport indicated a slight but consistent im-
provement of performance throughout the 30
minutes. The four experimental conditions
fell into two fairly distinct performance
groups which could be classified into “poor”
performance and “good” performance.

Graphical comparison of mean prehension
force on the prolonged trial and on the short
trials of our previous investigation showed a
considerable increase in absolute level for the
30 minute run.

All three criterion measures were directly
affected by change in surface friction, and to
a lesser extent by thickness of the material.
Performance with the arctic mittens corre-
sponded to performance with a thin hand-
covering with a very low coefficient of friction.

The results were discussed in relation to
other studies and to practical implications for
the design of protective handgear.


Fig. 2. Relation of coefficient of friction to mean
prehension force. (Right-hand performance.) [Figure
not reproduced. Legend: trial duration 3 minutes, trial
duration 30 minutes; axes: coefficient of friction, mean
prehension force (gms).]

Fig. 3. Relation of coefficient of friction to time
per transport and number of transports. (Right-hand
performance.) [Figure not reproduced.]


Received July 7, 1958. 


References


Askovitz, S. I. Rapid method for determining mean
values and areas graphically. Science, 1955, 121,
212. (a)

Askovitz, S. I. Mean rates of change and least
squares—interpretations and rapid graphic meth-
ods. J. appl. Psychol., 1955, 8, 347-352. (b)

Blair, E. A., & Gottschalk, C. W. Efficiency of Signal
Corps operators in extreme cold. U. S. Army Med.
Res. Lab. Rep., 1947, Rep. No. 2.

Bradley, J. V. Glove characteristics influencing con-
trol manipulability. USAF WADC Tech. Rep.,
1957, No. 57-389.

Duncan, D. B. Multiple range and multiple F tests.
Biometrics, 1955, 11, 1-42.

Groth, Hilde. An experimental assessment of pre-
hension force as a measurement of effort in psy-
chomotor skills. Unpublished doctoral disserta-
tion, Univer. California, Los Angeles, 1957.

Groth, Hilde, & Lyman, J. Effects of surface fric-
tion on skilled performance with bare and gloved
hands. J. appl. Psychol., 1958, 42, 273-27[?].

Hovland, C. I. Human learning and retention. In
S. S. Stevens (Ed.), Handbook of experimental
psychology. New York: Wiley, 1951.

Lindquist, E. F. Design and analysis of experiments
in psychology and education. Boston: Houghton
Mifflin, 1953.

McNemar, Q. Psychological statistics. New York:
Wiley, 1949.

Teichner, W. H., Kobrick, J. L., & Dusek, E. R.
Studies of manual dexterity: I. Methodological
studies. Natick, Mass.: QM. Res. & Development
Center, 1954. (Rep. No. EP-3.)

Telford, C. W., & Swenson, W. E. Changes in
muscular tension during learning. J. exp. Psy-
chol., 1942, 30, 236-246.

Walker, H. M., & Lev, J. Statistical inference.
New York: Holt, 1953.

Zweizig, J. R., & Lyman, J. The effect of laminar
configurations of handcovering materials on mean
prehension force. Los Angeles: Dept. of Engineer-
ing, Univer. California, 1957. (Rep. No. 57-24.)


Journal of Applied Psychology
Vol. 43, No. 3, 1959


AVA VALIDITY FOR TEXTILE WORKERS 1


PETER F. MERENDA AND WALTER V. CLARKE


Walter V. Clarke Associates, Inc.

The Activity Vector Analysis (AVA) is a
self concept personality assessment instru-
ment. It is widely used in industry in the
classification and selection of personnel at all
levels of employment. The details of the con-
struction and application of the AVA have
been published by Clarke (1956a, 1956b,
1956c, 1956d). Reliability and validity stud-
ies on this instrument have been reported
by Bennett (1957), Clarke (1956c), Hammer
(1958), Lundin (1957), Merenda (1958),
Musiker (1958), and Whisler (1957). Most
of this earlier research on the validity of
AVA, however, has been devoted either to va-
lidity in terms of personality description or to
the problem of classification of personnel on
the basis of AVA and criterion data derived
from concurrent samples. The present study
deals with the validity of AVA (over time) as
a predictor of on-the-job performance and
job success, within one company, of first line
workers in the textile industry.


Subjects


Subjects were all (N = 142) first line work-
ers, mainly at semiskilled and unskilled levels,
who possessed at least a sixth grade education
and who were hired by a large southeastern
textile concern over the 15-month period be-
tween January 1, 1957, and March 31, 1958.
Of the 142 Ss, 107 are males and 35 are fe-
males. Although a variety of jobs is repre-
sented in the sample, the occupations cluster
around the relatively low skill level and rou-
tine operational tasks involved in the manu-
facture of textile goods. The specific occupa-
tions represented by this sample are slubber
tender, doffer, spinner, yarn winder, card
tender, humidifier man, creeler, oiler, trucker,
and packer.


1 The authors gratefully acknowledge the assistance
and cooperation rendered by J. Vernon Wallace of
the Bibb Manufacturing Company, Macon, Georgia,
who supervised the collection of the data for this
project.


Criteria


There were two criterion measures of em-
ployee success used in the study. The first
of these was a locally-prepared five-item rat-
ing scale for measuring Job Proficiency. The
individual components on this scale are: 1.
Job Performance; 2. Attitude; 3. Coopera-
tion; 4. Learning Ability; 5. Attendance and
Promptness. Each item is scaled qualitatively
from “very poor” to “very good” with an ac-
companying numerical scale ranging from 0.0
to 5.0. A composite score which is the un-
weighted sum of these five components is ob-
tained for the scale and yields a range of total
scale scores from 0.0-25.0.

The Job Proficiency Scale was subjected to
internal consistency analysis. The method
used was one developed recently by Stanley
(1957). It is a generalization of the well-
known K-R Formula #20 applied to non-
dichotomously scored items and is algebrai-
cally equivalent to Cronbach’s (1951) Coeffi-
cient Alpha and Hoyt’s (1941) analysis of
variance formulas.

Internal consistency coefficients for three
raters of this study using the instrument at
various time intervals are reported in Table 1.
The ratings were made by the criterion raters
at the end of 30 days and again at 90 days
after first employment for those who were
still on the payroll on those dates. For the
attrition group the ratings were made on the
severance date.

It will be noted in Table 1 that the six in-
ternal consistency coefficients reported there
are all substantially high. Hence, the scale
appears to be relatively homogeneous and the
use of the composite score as a criterion meas-
ure is permissible.
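Because the internal consistency coefficient used here is described as algebraically equivalent to Cronbach's Coefficient Alpha, the computation can be sketched for a five-item rating scale as follows. This is a minimal illustration, not the authors' procedure: the function name `coefficient_alpha` is hypothetical, and the ratings in `example` are invented solely to show the arithmetic (each item scored 0.0 to 5.0, as on the Job Proficiency Scale).

```python
from statistics import variance


def coefficient_alpha(ratings):
    """Coefficient alpha for a list of rows, one row of item scores per ratee."""
    k = len(ratings[0])  # number of items on the scale
    item_variances = [variance([row[i] for row in ratings]) for i in range(k)]
    total_variance = variance([sum(row) for row in ratings])  # composite score
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


# Hypothetical ratings of four workers on the five scale items.
example = [
    [4.0, 4.5, 4.0, 3.5, 4.0],
    [2.0, 2.5, 3.0, 2.0, 2.5],
    [3.0, 3.0, 3.5, 3.0, 2.5],
    [5.0, 4.5, 4.5, 4.0, 5.0],
]
alpha = coefficient_alpha(example)
print(round(alpha, 2))
```

When every item ranks the workers identically, the item variances account for only a small share of the composite variance and alpha approaches 1; values near that limit are what justify summing the items into one composite score.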


Raters

There were two categories of raters for this
study. The predictor raters were the com-
pany personnel manager who was a trained
interpreter (analyst) of the AVA and the as-
sistant personnel manager (interviewer) who
was not trained to use the AVA. The cri-
terion raters were the supervising foremen, re-
spectively, of the Ss of this study.

The 142 Ss of this study were divided
among six foremen. However, separate analy-
ses were made for only three of the raters,
since these three supervised the great ma-
jority (80%) of the sample. The remainder
of the Ss was scattered among the other three
foremen. Consequently, the n’s for these
others were too small for practical purposes
and, therefore, these latter were not analyzed
separately. The analyses based on the total
group include all cases under all six foremen.


Procedure


Beginning in January 1957 and continuing for a
15-month period, every new applicant for each of
the jobs listed previously in this article was hired
on the basis of an interview by the “inter-
viewer” of this study. The decision as to hire or
not hire was made completely without any reference
to the AVA. Because of the employment situation
relative to the supply and demand of workers for
the jobs of the study, especially during peak produc-
tion periods, it was necessary to hire persons who
did not meet the criteria for employment established
by the interviewer.

Within a day or two after each of the Ss of this
study was hired, he was administered an AVA by
the interviewer who originally hired him. The AVA
analyst was given the completed AVA forms. He
then scored and interpreted the profiles of results.
He had no personal contact with the S prior to the
administration and scoring of the AVA.

Both the AVA analyst and interviewer predicted
the expected job proficiency of these Ss using the
rating scale just described. The ratings were inde-
pendently made. The interviewer made his predic-
tions solely on the basis of information gathered
during the employment interview. The AVA ana-
lyst made his predictions solely on the basis of in-
formation revealed to him through his interpretation
of the AVA. Hence, the predictions were actually
made on the basis of “blind analyses” of AVA.

The AVA analyst, in addition to his knowledge of
AVA theory and application, also was well ac-
quainted with the nature of the occupations for
which he was predicting and with the over-all com-
pany philosophy regarding the treatment of workers.
This additional knowledge, no doubt, lent consider-
able assistance in his interpretation of AVA profiles
for the purpose of predicting success or failure of the
Ss of the study.

Predictions were made by the AVA analyst of job
performance ratings for each of the 142 Ss. Unfor-
tunately, however, the interviewer did not begin his
predictions until after the project had been under
way for several months. Hence, interviewer predic-
tions are available on only a portion of the sample.

Comparisons were made, however, between the 49
commonly rated personnel and the total sample with
respect to sex distribution, age, educational level, and
job performance ratings. No statistically significant
differences were found. Hence, it may be concluded
that the reduced sample on whom the interviewer
made his predictions was representative of the total
sample.

For those workers who had been on the job for
30 days, their respective foremen were given the Job
Proficiency Rating Scale and asked to rate them on
each of the five items on the scale. For those work-
ers who for one reason or another did not survive
the first month, the ratings were made on the last
day of work with the company. The same process
was repeated at the end of each 90-day period for
all Ss surviving more than 30 days on the job.
Again, as previously, ratings were made on the last
day of employment for those workers leaving prior
to the expiration of 90 days on the job.

Comparisons were made between the AVA ana-
lyst’s and interviewer’s predicted ratings with those
made by the criterion raters.


Table 1


Coefficients of Equivalence for Three
Independent Raters


                       Time of Rating
Criterion Rater      30 Days      90 Days
Foreman I            .87 (57)     .78 (36)
Foreman II           .91 (38)     .94 (29)
Foreman III          .87 (19)     .85 (13)

Note.—Figures in parentheses are n’s upon which each co-
efficient is based.

was repeated 


Results 


Product moment correlations were calcu-
lated between predicted and actual rating
scores for both the AVA analyst and inter-
viewer. These statistics are presented in
Table 2. Inspection of the data of this table
reveals that for the pooled groups of all raters
substantial correlations were found to exist
between the predicted and criterion scores for
both the AVA analyst and the interviewer.
The over-all validity coefficients were found
to be about equal for both the 30-day pre-
dictions (r = .50 for analyst, r = .41 for in-
terviewer), and the 90-day predictions (r =
.58 for analyst, r = .33 for interviewer). The
differences in the validity coefficients were not
statistically significant.
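The article does not state how the Critical Ratio for the difference between two validity coefficients was computed. One common approach, assumed here purely for illustration, converts each r to Fisher's z and divides the difference by its standard error for independent samples. The function name `critical_ratio` is hypothetical, and treating the two samples as independent is itself an assumption, since the interviewer's cases are a subset of the analyst's.

```python
import math


def critical_ratio(r1, n1, r2, n2):
    """Difference between two correlations in Fisher z units over its standard error."""
    z1, z2 = math.atanh(r1), math.atanh(r2)  # Fisher r-to-z transformation
    standard_error = math.sqrt(1 / (n1 - 3) + 1 / (n2 - 3))
    return (z1 - z2) / standard_error


# Over-all validity coefficients and n's for analyst vs. interviewer.
cr_30 = critical_ratio(0.50, 142, 0.41, 49)  # 30-day ratings
cr_90 = critical_ratio(0.58, 85, 0.33, 37)   # 90-day ratings
print(round(cr_30, 2), round(cr_90, 2))
```

Under these assumptions the sketch gives roughly 0.67 and 1.57, close to the reported Critical Ratios of 0.66 and 1.57, and both fall short of the 1.96 needed for significance at the 5% level.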


Table 2


Validity Coefficients for AVA Analyst and Interviewer at End of 30-Day and 90-Day Periods


                                              Criterion Rater
                  Predictor
                  Rater           I           II           III          All          Interviewer
30-Day Ratings    AVA Analyst    .48 (57)    .24* (38)    .13* (19)    .50 (142)    .33 (49)
                  Interviewer    .54 (22)    .47* (16)    **           .41 (49)
90-Day Ratings    AVA Analyst    .59 (36)    .59 (29)     .09 (13)     .58 (85)
                  Interviewer    .66 (18)    .25* (16)    **           .33 (37)
30- vs. 90-Day Ratings           .67 (36)    .88 (29)     .03 (13)     .58 (85)

* Not significant.
** Insufficient n.
Note.—Figures in parentheses are n’s upon which each r is based.


The Critical Ratio for the difference in r’s
between the analyst’s and the interviewer’s
predictions was 0.66 for the 30-day ratings
and 1.57 for the 90-day ratings.

The data seem to indicate that the blind
use of AVA and the interview techniques both
show substantial validity in terms of the cri-
teria of this study.

Correlations between predicted scores and
criterion ratings by the individual raters,
[...] both 30- and 90-day predictions, and
Rater II on 90-day pre[...] [dif]ferent from
zero. [...] to reach the ac[...] [in]tervals.
This finding suggests relatively low reliability
of this rater’s judgments and conceivably is
another factor attenuating the validity coeffi-
cient for both predictor raters. On the other
hand, Rater I (r = .67) and Rater II (r = .88)
were considerably more consistent in their
two ratings. For all raters the consistency
of ratings made at 30 days and at 90 days
was also relatively high (r = .58). The AVA
analyst and interviewer agreed only to a
moderate degree (r = .33) with respect to
common ratings assigned to 49 Ss.


Discussion


The correlation coefficients between pre-
dicted ratings and criterion scores of the com-
bined groups on the job proficiency scale
proved in this study to be positive and sta-
tistically significant for both predictor raters.
Those based upon a blind interpretation of
AVA seem to be somewhat higher at the end
of 90 days than those based upon personal
interview. However, the difference proved not
to be statistically significant.

These findings attest to the predictive va-
lidity of both procedures for the specific occu-
pations studied and are suggestive of prob-
able increased predictive efficiency of the
combined use of AVA and personal interview.
In the operational setting, the AVA is used
as an adjunct to, and not a substitute for, the
personal interview. The fact that these same
correlations for individual raters show consid-
erable variation from the combined raters’
correlations is probably due to differences in
sample sizes and possibly the greater un-
reliability of the judgments of one criterion
rater. Overall, however, the results of this
study tend to show that equally good and
possibly better long-range predictions of on-
the-job success can be made through the
skilled use of AVA as through personal inter-
view procedures.

The findings of this study confirm the re-
sults of other recently completed studies
(Lundin, 1957; Bennett, 1957) involving
blind analysis of AVA. These earlier studies
were concerned with the validity of AVA in
describing personality. The present study
has investigated and reported on the problem
of how the temperament characteristics, as
measured by AVA, are associated with on-the-
job success of workers in certain occupational
areas.


Summary and Conclusions 


Blind predictions as to the probable on-the-
job success of applicants for routine machine
operational as well as other semiskilled and
unskilled jobs in a large textile concern were
made solely on the basis of AVA profiles.
These predictions were in the form of nu-
merical ratings on a job proficiency scale.
Subjects were 142 new hires for various first
line worker jobs over a period of 15 months.
Several foremen, supervisors of the Ss, were
asked to rate them on the job proficiency
scale 30 days and 90 days after employment.
Comparisons of the AVA analyst’s predictions
and raters’ judgments were then made. Pre-
dictions made by the interviewer who actu-
ally did the hiring were also compared with
the criterion ratings on a portion of the total
sample.

Internal consistencies of the rating scale
ranged from .78 to .94 with a median of .87.
Product moment correlations between ana-
lyst’s predictions and the criterion scores of
the combined groups of raters were .50 for
30 days and .58 for 90 days. For the inter-
viewer the r’s were .41 for 30 days and .33
for 90 days.

On the basis of these findings it may be
concluded that both the interview techniques
and the AVA, when employed by a trained
interpreter, are valid predictors of job success
for the occupations studied. The data of the
study also suggest that the predictive effi-
ciency may be enhanced by combining these
two procedures in the selection of textile
workers performing routine operational tasks.


Received July 7, 1958. 


References


Bennett, J., Jr., Musiker, H. R., & Clarke, W. V.
Activity Vector Analysis vs. clinical appraisal in
personality description. Unpublished manuscript.
Providence, R. I.: Walter V. Clarke Ass., 1957.

Clarke, W. V. Personality profiles of loan office
managers. J. Psychol., 1956, 41, 405-412. (a)

Clarke, W. V. Personality profiles of self-made com-
pany presidents. J. Psychol., 1956, 41, 413-418.
(b)

Clarke, W. V. The construction of an industrial se-
lection personality test. J. Psychol., 1956, 41, 379-
394. (c)

Clarke, W. V. The personality profiles of life insur-
ance agents. J. Psychol., 1956, 42, 295-302. (d)

Cronbach, L. J. Coefficient alpha and the internal
structure of tests. Psychometrika, 1951, 16, 297-
334.

Hammer, C. H. A validation study of the Activity
Vector Analysis. Unpublished dissertation, Purdue
Univer., 1958.

Hoyt, C. J. Test reliability estimated by analysis of
variance. Psychometrika, 1941, 6, 153-160.

Lundin, W. H. A clinical evaluation of the Activity
Vector Analysis Test. Unpublished manuscript.
Chicago, Illinois: T. W. Franks and Associates,
1957.

Merenda, P. F., & Clarke, W. V. AVA as a pre-
dictor of occupational hierarchy. J. appl. Psy-
chol., 1958, 42, 289-292.

Musiker, H. R., & Clarke, W. V. The descriptive
reliability of Activity Vector Analysis. Psychol.
Rep., 1958, 4, 435-438.

Stanley, J. C. K-R 20 as a stepped-up mean r
among items. Paper read at annual convention of
the National Council on Measurements Used in
Education. Atlantic City, New Jersey, Feb., 1957.
(Mimeo.)

Whisler, L. W. A study of the descriptive validity
of Activity Vector Analysis. J. Psychol., 1957, 43,
205-223.

Journal of Applied Psychology 
Vol. 43, No. 3, 1959 


AN INVESTIGATION OF SOME ASPECTS OF THE
SOCIAL PSYCHOLOGICAL IMPACT OF
AN EDUCATIONAL TELEVISION
PROGRAM 1


JAMES J. ASHER 2 AND RICHARD I. EVANS


University of Houston 


Since the first noncommercial educational
television station, KUHT-TV, began opera-
tions in April, 1953, a variety of studies in-
volving the educational television medium have
been completed. Typical reports of such studies,
which investigate such problems as the rela-
tive effectiveness of television vs. formal
course instruction, or the demographic and
personality characteristics of the educational
television audience, appear elsewhere
(Adams, 1956; Carpenter, 1955; Evans,
1955, 1956, 1957; Evans, Roney, & Mc-
Adams, 1955; Husband, 1954; Kumata, 1956;
Merrill, 1956; Rock, Duva, & Murray, 1957).

Another provocative direction for research
might be the examination of the social psy-
chological impact of general educational tele-
vision programming as is typified by the pro-
ductions released to the various educational
television stations by the Educational Tele-
vision and Radio Center of Ann Arbor, Michi-
gan. Such an investigation is reported in the
present paper.

Inherent in such an investigation are sev-
eral interesting questions. Three are dealt
with in the present report:

1. To what degree are attitudes of viewers
changed by a given educational television pro-
gram? An over-all statement that might be
made in relation to the impact on attitudes of
mass media messages in general is that atti-
tude shifts are, indeed, observable. Schramm
(1949) makes this relevant observation:
“More than one hundred papers have
presented quantitative evidence that [attitude]
change occurs, and that it can occur as the
result of messages translated by any of the
mass media or combination of mass and in-
terpersonal communication.”

2. To what degree can personality measures
predict the direction and intensity of attitude
shifts? Here Hovland and Weiss
summarize the results of investigations deal-
ing with this problem and suggest that such
predictions of change are reported, but are
generally topic-bound in nature. In other
words, only “specific-to-content” rather than
generalized predictions of attitude change ap-
pear to have been demonstrated. A person-
ality measure of more general applicability
appears to be called for here.

3. Would the credibility of the source of
an educational television program have an
effect on the degree of intensity of attitude
change? For example, would a greater atti-
tude change occur if the audience believed
that it was being presented on a national
commercial network rather than a local edu-
cational television station? In general, typi-
cal investigations of this problem as reported
elsewhere (Janis, 1954; Wegrocki, 19[?]) in-
dicate that a suggestion is more likely to be
accepted if it is associated with a high pres-
tige source.


1 In part based on data gathered [...]
with a dissertation presented by th[...]
the present paper in fulfillment [...]
Computer Center at the University of Houston.
2 Now a member of the Psychology department at
San Jose State College, San Jose, California.


a 


urce of 


ues” 
To give operational expression to the questions raised above, an attempt was made in the present study to evaluate the impact of a specific educational television program entitled Puberty in Girls, one of the typical half-hour programs in the series entitled People Are Taught To Be Different, produced by the University of Houston for the Educational Television and Radio Center for national distribution. This film compared the reactions of three cultures with respect to the psychological onset of puberty in girls. A Negro professor narrates the message while Negro college students illustrate the context with modernistic or symbolic dancing and music.
The following hypotheses were formulated from the questions raised above as they were to be dealt with in the present investigation:

1. Attitudes, as measured by the Osgood Semantic Differential (Evaluative Dimension), toward a group of 11 typical concepts represented in the program (“Negro professor,” “KUHT-TV Channel 8,” “female monthly cycle,” “CBS-TV Channel 11,” “Negro,” “public discussion of sex,” “Texas Southern University,” “public discussion of puberty in girls,” “intelligent Negroes,” “use of dancing as a teaching aid,” “use of unusual music as a teaching aid”), would change in a significantly more favorable direction as a result of viewing the program.

2. Highly dogmatic individuals, as indicated by scores on the Rokeach Dogmatism Scale, would resist changes in attitudes, as measured by the Osgood Scale, to a significantly greater extent than individuals who were relatively less dogmatic.

3. The program when presented from a source of communication with allegedly high prestige, a major television network (the Columbia Broadcasting System), would produce significantly greater attitude change than when it is presented from a source of allegedly lower prestige, a local noncommercial educational television station, KUHT-TV.


Methodology 


The subjects, all undergraduate students, consisted of three classes enrolled in elementary psychology at the University of Houston, not differing significantly in initial over-all composition. The control group (n = 30) did not view the television program described earlier. Experimental Group I (n = 41), designated as the high prestige group, was allowed to believe that a major television network (Columbia Broadcasting System) was presenting the program, while Experimental Group II (n = 36), designated as the low prestige group, was allowed to believe that the program originated from KUHT-TV, the University’s noncommercial educational television station.

The measurement of attitudes in the present study, 
as suggested above, was effected through the use of 
the Semantic Differential, Evaluative Dimension, of 
Osgood (1952), since as a seven-point generalized 
attitude scale, now widely used, it appeared to lend 
itself well to the evaluation of the impact of a mass 
medium message involving an array of concepts such 
as is involved in the one-half hour television pro- 
gram used in the present study. 

The search for a personality measure that would
be less “topic-bound” as it potentially relates to attitude
change led the authors of the present paper,
as suggested above, to the Dogmatism (D) Scale of
Rokeach (1954). Rokeach and Fruchter (1956)
demonstrated by factor analysis that D measures the 
rigidity, authoritarianism, and anxiety of the F and 
E scales; yet dogmatism has the advantage of being 
independent of belief content. Therefore, no matter 
what the content of a communication might be, we 
would theoretically expect that personality factors 
reflected in D-scale scores might be involved in a 
tendency toward opinion or attitude change. 

One of the experimenters administered the Semantic 
Differential and Dogmatism Scales to the control and 
two experimental groups of Ss in a pretest, pre- 
sented the television program to the two experi- 
mental groups, and administered the posttest of the 
Osgood and Rokeach scales to the three groups. Ex- 
posure to the program was effected in a specially de- 
signed closed-circuit viewing room for both experi- 
mental groups. Evidence gained from responses of 
Ss confirmed the success of the staging of an os- 
tensibly genuine telecast in each instance. In sta- 
tistically analyzing the results in the present study, 
the nonparametric techniques, chi square and rank 
order correlation, appeared to be applicable. 


Results and Discussion 


With respect to the first hypothesis, shifts
in only three of 11 instances proved to be
statistically significant. These were toward
the following concepts: (a) “female monthly
cycle,” which was significant in the high prestige
and low prestige groups at the .10 and
.02 levels (chi squares of 2.8 and 5.5, respectively,
with 1 degree of freedom); (b) “Negro,”
which was significant at the .01 and .05
levels (chi squares of 7.1 and 3.5, respectively,
with 1 degree of freedom); (c) “public
discussion of puberty in girls,” which was
significant at the .05 and .01 levels (chi
squares of 3.8 and 16.0, respectively, with 1
degree of freedom).
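
The relation between a chi square with 1 degree of freedom and its probability level can be sketched as follows. This is only an illustration: the function name is an assumption, the probability is the ordinary upper-tail value, and the report does not state exactly how its significance levels were read off.

```python
import math


def chi2_upper_tail_1df(x: float) -> float:
    """Upper-tail probability P(X >= x) for a chi-square variable with 1 df."""
    # A chi-square variable with 1 df is the square of a standard normal Z,
    # so P(X >= x) = P(|Z| >= sqrt(x)) = erfc(sqrt(x / 2)).
    return math.erfc(math.sqrt(x / 2.0))


# Chi squares quoted in the text: (high prestige group, low prestige group).
reported = {
    "female monthly cycle": (2.8, 5.5),
    "Negro": (7.1, 3.5),
    "public discussion of puberty in girls": (3.8, 16.0),
}

for concept, (high, low) in reported.items():
    print(
        f"{concept}: high prestige p = {chi2_upper_tail_1df(high):.3f}, "
        f"low prestige p = {chi2_upper_tail_1df(low):.3f}"
    )
```

Only the standard library is used; the closed form holds for 1 degree of freedom only, which is the case for every chi square quoted above.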

No significant changes with respect to any of the eleven concepts appeared in the control group.


The fact that a shift in the attitude toward 
the concept, “Negro,” was found suggests 
that in such instances where the original atti- 
tude was probably highly crystallized through 
previous experience, as is the case in the 
South with respect to attitudes toward “Ne- 
gro,” this type of “random target” television 
program may have at least an immediate 
effect. However, impacts involving newer as- 
sociations such as the concepts, “Negro pro- 
fessor” or “dancing as a teaching aid,” may 
be difficult to effect through any single edu- 
cational television program. 

With respect to the second hypothesis, rank 
order correlations were calculated between 
Dogmatism Scale scores and difference scores 
(pre-attitude test minus post-attitude test) 
regardless of sign in both experimental groups. 

Only two correlations in 22 were significant
within the .05 level of confidence. Only one
of the two was in the predicted direction. 
Such findings, of course, could easily have oc- 
curred by chance. Thus the basic hypothesis 
must be rejected. 
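
The remark that two significant correlations in 22 could easily arise by chance can be illustrated with a binomial calculation. This is a sketch under a simplifying assumption that is not in the report: the 22 tests are treated as independent, which correlations computed on the same subjects are not exactly.

```python
from math import comb


def prob_at_least(k: int, n: int, alpha: float) -> float:
    """P(at least k of n independent tests reach level alpha when no effect exists)."""
    # One minus the probability of observing fewer than k significant results.
    fewer = sum(comb(n, i) * alpha**i * (1 - alpha) ** (n - i) for i in range(k))
    return 1.0 - fewer


n_tests = 22   # 11 concepts in each of the two experimental groups
alpha = 0.05   # level of confidence used in the text

print(f"expected number significant by chance: {n_tests * alpha:.1f}")
print(f"P(two or more significant by chance): {prob_at_least(2, n_tests, alpha):.2f}")
```

Under the independence assumption, about one significant result in 22 is expected by chance alone, which is the sense in which the authors reject the hypothesis.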

However, another possibility would be that 
the greater the dogmatism, the more attitudes 
towards the source of the message will tend 
to determine attitude toward the content of 
the message. A correlational analysis of this 
possibility revealed no statistically acceptable 
support of it. 


Still another possibility might be that highly dogmatic Ss will tend to be more extreme in the intensity of their own attitudes than are low dogmatics. None of the chi squares computed to examine this possibility supported it.


However, the D sc 
attitude change if t 


puberty 
he pro- 
directly 
ego-involving
character to the respondent, even though
highly controversial content was present in a
more subtle sense. Therefore, in future studies
utilizing the D scale as a possible predictive
tool, messages might be selected that are
more overtly ego-involving. This might supply
the basis for a more sensitive evaluation
of the D scale as a predictive measure.

For the third hypothesis, the chi-square test was used to test the significance of the differences between the high and low prestige groups on positive attitude shifts (pretest subtracted from the posttest) as measured by the Semantic Differential. None of the differences approached significance at even the .10 level of confidence.

In analyzing this portion of our results, a consideration of the adaptation level (AL) theory of Helson (1951) might prove provocative. Did the high prestige group, for example, expect a typical commercial network program, but instead were given a production that deviated sharply from this anticipation? Perhaps the distance between expectancy and actuality, which was probably greater for the high prestige group, precipitated a negative “set” toward the program, which in turn influenced attitudes toward it. Some indirect evidence reported by Evans (1957) suggests this possibility. It was shown that the low prestige group had slightly higher correlations and a greater number of significant correlations between their attitudes toward the program as a whole and attitudes towards concepts within the program. In addition, this group liked the program as a whole more than did the audience who expected a high prestige program. This suggests that the prestige of the communication source has by no means a simple, direct relationship to the persuasiveness of the communication.

Another possibility that may be suggested here is that the very fact that the impact of the same program presented ostensibly over an educational television station was at least as great as it was when presented ostensibly over a major commercial television network may be regarded with great encouragement by participants in the educational television movement. It is possible that the self-conscious belief of many participants in educational television activities that the lack of relative prestige of educational television stations as a source diminishes the impact of programming may be erroneous. In fact, as suggested in an earlier study by Evans (1957), the conception that educational television lacks prestige in the eyes of the viewers may be more imagined than real.


Received July 22, 1958. 


References 


7 J.S. An exploratory study of viewers and 
es of educational television. Chapel Hill: 
1986. Res. soc. Sci, Univer. of North Carolina, 

Asher, J. J. An investigation of a group of social psychological factors related to the impact of an educational television program. Unpublished doctoral dissertation, Univer. of Houston, 1957.

Carpenter, C. R. Psychological research using television. Amer. Psychologist, 1955, 10, 606-610.

Evans, R. I. The planning and implementation of a psychology series on a non-commercial educational television station. Amer. Psychologist, 1955, 10, 602-605.

Evans, R. I. An examination of students’ attitudes toward television as a medium of instruction in a psychology course. J. appl. Psychol., 1956, 40.

Evans, R. I. An analysis of some demographic and psychological characteristics of an educational television audience. Ann Arbor: Educational Television and Radio Center, 1957.

Evans, R. I., Roney, H. B., & McAdams, W. J. An evaluation of the effectiveness of instruction and audience reaction to programming on an educational television station. J. appl. Psychol., 1955, 39, 277-279.


Helson, H. Perception. In H. Helson (Ed.), Theoretical foundations of psychology. New York: Van Nostrand, 1951. Pp. 348-385.

Hovland, C. I., Lumsdaine, A. A., & Sheffield, F. D. Experiments on mass communication. Princeton: Princeton Univer. Press, 1949.

Hovland, C. I., & Weiss, W. The influence of source credibility on communication effectiveness. Publ. opin. Quart., 1951, 15, 635-650.

Husband, R. W. Television versus classroom for learning general psychology. Amer. Psychologist, 1954, 9, 181-183.

Janis, I. L. Personality correlates of susceptibility to persuasion. J. Pers., 1954, 22, 504-518.

Kumata, H. An inventory of instructional television research. Ann Arbor: Educational Television and Radio Center, 1956.

Merrill, I. R. Benchmark television-radio study, Part I: Lansing. WKAR-TV Res. Rep. East Lansing: Michigan State Univer., 1956. (Rep. No. 561M.)

Osgood, C. E. The measurement of meaning. Psychol. Bull., 1952, 49, 197-237.

Rock, R. T., Jr., Duva, J. S., & Murray, J. E. Training by television: The comparative effectiveness of instruction by television, television recordings and conventional classroom procedures. Port Washington, L. I., N. Y.: Special Devices Cent., 1957. (SDC REP. 476-02-2 [NAVEXOS P-850-2].)

Rokeach, M. The nature and meaning of dogmatism. Psychol. Rev., 1954, 61, 194-204.

Rokeach, M., & Fruchter, B. A factorial study of dogmatism and related concepts. J. abnorm. soc. Psychol., 1956, 53, 356-360.

Schramm, W. The effects of mass communication: A review. Journalism Quart., 1949, 26, 307-409.

Wegrocki, H. J. The effect of prestige suggestibility on emotional attitude. J. soc. Psychol., 1934, 5, 384-394.


Journal of Applied Psychology
Vol. 43, No. 3, 1959 


THE PSYCHOLOGIST AS AN INSTRUMENT OF 
PREDICTION 


ARNE TRANKELL 


Institute of Education, Stockholm University, Sweden


In the Journal of Abnormal and Social Psy- 
chology Robert R. Holt (1958) has drawn 
attention to a technically important question 
which ought to be taken into consideration in 
any particular predictive enterprise. He sum- 
marized his discussion as follows: “When 
clinical methods are given a chance—when 
skilled clinicians use methods with which they 
are familiar, predicting a performance about 
which they know something—and especially 
when the clinician has a rich body of data 
and has made the fullest use of the system- 
atic procedures developed by actuarial work- 
ers, including a prior stud bearing of 
the n perform- 
prediction 
sses.” 
combina- 
on stand- 
ach in which 
nity to evalu- 
ychologist can 


an Airlines Sys- 
mination of this 


ears has now been 
d in this article. I 
illustration to what 


Robert R. Holt has wanted to Prove. 


The Selection Task 
Since 1951 the applicants for co-pilot courses in Scandinavian Airlines System have been examined by means of a system of selection worked out by the author. This system is based on a job analysis of the airline pilots.

On the basis of this analysis the following list of assessment variables was made up:

1. Maturity
2. Self-reliance
3. Authority
4. Tactfulness
5. Independence
6. Social Adjustment
7. Sensitivity to Criticism
8. Panic Resistance
9. Motor Skill
10. Verbal Intelligence
11. Inductive Intelligence
12. Technical Intelligence
13. Ability to Organize
14. Simultaneous Capacity


For the assessment of some of the variables a battery of standardized tests was used. Other variables were assessed in special procedures applied in individual examinations. Among these were Motor Skill, Panic Resistance and Simultaneous Capacity, assessed by means of a technique devised by the author and described by Langewiesche (1955).

The individual examinations were done by two or three psychologists who examined each applicant independently. This arrangement had a correcting as well as a [...]ing effect. The examinations led up to reports in which the psychologists described the applicant’s characteristics as regards the assessment variables. In addition to this they rated the variables on a stanine scale. Aptitude as co-pilot and as captain were also assessed on the stanine scale.

The final decision as regards each applicant was obtained at a meeting of the psychologists who had examined the applicant. At these meetings the reports were checked against each other. On points where opinions differed the applicant was discussed until agreement was reached. Finally the ratings of the assessment variables were read, and an [...] was arrived at for each of them. The company was not advised to hire applicants with aptitude scores lower than [...] on the stanine scale.

During the years 1951-56 a total of 780 applicants were examined. Altogether 363 applicants were assigned to co-pilot courses. During or subsequent to their training period, 29 of the assignees were dismissed due to inability to fly in SAS. The dismissals were based on the results obtained in the co-pilot courses and on the opinions of the instructors and checkpilots of SAS. The aptitude assessments were in no case known to instructors or checkpilots, who had to base their opinions on their own observations of each pilot aspirant during and after the course.

The validity of the selection system is examined by comparisons between the remaining and the dismissed pilots.


The Test Variables


Standardized paper and pen tests were used in measuring the following traits:

Inductive Intelligence
Verbal Intelligence
Mechanical Comprehension
Sensitivity (personal inventory)
Simultaneous Capacity (cancellation test)

1 One of the requirements to be fulfilled in order to be accepted for the psychological examinations was at least 350 hours’ flying experience. This implies that all testees had already proved themselves able to fly an airplane.


Table 1

Correlations and t Values for the Tests

Test Variable               r_bis         Difference     t      N
Simultaneous Capacity       0.42 ± .09    1.67 ± .36     4.64   363
Inductive Intelligence      0.33 ± .09    1.06 ± .33     3.21   363
Verbal Intelligence         0.28 ± .13    0.94 ± .42     2.24   166
Mechanical Comprehension    0.21 ± .09    0.73 ± .32     2.28   363
Sensitivity                -0.07 ± .10   -0.25 ± .34     0.74   363

$$s^2 = \frac{\sum (X - \bar{X})^2 + \sum (Y - \bar{Y})^2}{N_x + N_y - 2}$$


The bi-serial correlations between test scores and the criterion remaining-dismissed are shown in Table 1. The differences between the two groups and the t values are also given. The results do not differ from those usually obtained for tests of this kind.
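
The t values of Table 1 can be checked against the other columns. The sketch below assumes that each entry in the Difference column is a mean difference followed by its standard error, so that t is simply their ratio; that reading, and the function name, are assumptions rather than statements from the article.

```python
# (mean difference, assumed standard error, reported t) for each test in Table 1.
table_1 = {
    "Simultaneous Capacity": (1.67, 0.36, 4.64),
    "Inductive Intelligence": (1.06, 0.33, 3.21),
    "Verbal Intelligence": (0.94, 0.42, 2.24),
    "Mechanical Comprehension": (0.73, 0.32, 2.28),
    "Sensitivity": (-0.25, 0.34, 0.74),
}


def t_ratio(difference: float, standard_error: float) -> float:
    """Absolute t value as the ratio of a mean difference to its standard error."""
    return abs(difference) / standard_error


for name, (difference, se, reported_t) in table_1.items():
    print(f"{name}: computed t = {t_ratio(difference, se):.2f}, reported t = {reported_t}")
```

Under this reading every computed ratio agrees with the tabled t to within rounding.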

Besides the standardized tests the battery 
contained a number of tasks designed to form 
the basis for individual examinations. Among 
those were an autobiographical essay and a 
questionnaire regarding childhood, adoles- 
cence, school, family life, spare time activi- 
ties, etc. Another questionnaire, Flying An- 
amnesis, contained questions about the appli- 
cant’s career as a pilot in the Air Force (or 
elsewhere), flying hours, types of aircraft 
flown, ranking in various schools, details con- 
cerning first interest in flying, emergencies, 
and incidents. 

Finally the test battery contained a complicated organizational problem, where the interplay of a number of factors had to be kept in mind when working out a schedule. The test demands a fairly high level of general intelligence, as well as perseverance and self-reliance. It allows a great number of solutions. No standardized norms were applied in evaluating this test, which was regularly used as a subject for discussions in the individual examinations.


The Assessment Variables 
The standardized tests could not provide 
material for an adequate assessment of all 
variables to be considered. All assessment 
variables were, however, considered in the in- 
dividual examinations. 




In assessing the variables representing in- 
tellectual capacities the results of the stand- 
ardized tests were looked upon as hypotheses 
to be further examined. Extremely low test 
results were regularly looked upon with sus- 
picion and checked by individual discussions. 
The corrections were determined by the ex- 
aminer’s personal judgment and interpreta- 
tion. The applicant’s ability to use his ca- 
pacities in practical life situations was thus 
under consideration before the ratings were 
looked upon as definite. 

For the assessment of Motor Skill, Panic 
Resistance and Simultaneous Capacity a pro- 
cedure called Tapping was used. On a paper 
were printed two patterns, consisting of small 
circles connected by straight lines. One pat- 
tern was for the left, one for the right hand. 
The applicant was given a pencil in each hand 
and was told to place the point of the pencil 
in one of the small circles in either pattern. 
The pencils then had to be moved from circle 
to circle following the lines. The applicant 
moved alternately the right and the left hand 
and the speed was determined by the examiner 
beating the time. The degree of difficulty 
could easily be adapted to the coordination 
level of the applicant. By increasing the
speed the difficulty could be greatly increased
and cause heavy stress on the applicant.
Sudden unexpected difficulties were also introduced
in order to create panic reactions.


Together with findings such as lowered self-confidence, these signs were used in assessing the Panic Resistance variable. Also to be taken into consideration in this connection was the applicant’s descriptions of his own reactions in emergencies or critical situations which he had experienced. Motor Skill and Simultaneous Capacity were assessed by observations of the tapping behavior in its pure form and when complicated by intellectual problems to be solved simultaneously with the manual task. No quantitative measures were used and no quantitative scoring was done except for the stanine ratings.

In Table 2 the evaluations of the assessment variables are shown. All the correlations are significant except for the typical captain traits Maturity and Tact. The variables are arranged in accordance with the size of the bi-serial correlations. Those variables are highest on the list which have been assumed as most important for a co-pilot.


Table 2

Correlations and t Values for the Assessments

Assessment Variable         r_bis         Difference     t      N
Simultaneous Capacity       0.55 ± .08    0.98 ± .16     6.13   363
Panic Resistance            0.47 ± .08    1.08 ± .21     5.14   363
Self Reliance               0.44 ± .08    1.10 ± .23     4.77   363
Motor Skill                 0.43 ± .09    0.84 ± .18     4.72   363
Social Adjustment           0.40 ± .09    0.79 ± .18     4.39   363
Inductive Intelligence      0.40 ± .09    1.19 ± .27     4.35   363
Ability to Organize         0.36 ± .09    0.93 ± .24     3.92   363
Verbal Intelligence         0.32 ± .09    0.82 ± .24     3.45   363
Mechanical Comprehension    0.30 ± .09    0.88 ± .27     3.26   363
Independence                0.30 ± .09    0.64 ± .20     3.26   363
Authority                   0.28 ± .09    0.50 ± .17     3.02   363
Maturity                    0.11 ± .09    0.23 ± .19     1.21   363
Tact                        0.03 ± .10    0.05 ± .17     0.29   363
Sensitivity                -0.21 ± .09   -0.49 ± .18     2.79   363


Improvement of the Validity of the Tests


Of special interest are those variables which were assessed by using the results of the standardized tests as starting points. Table 3 sets out side by side the bi-serial correlations for these tests and the corresponding ones for the assessments. In order to show to what extent the tests have been utilized by the psychologists, that part of the variation of the assessment variable which can be explained by the variation of the test variable has been given in the third column. As could be expected, the tests prove to be least utilized in the cases where the assessment variable is only partly covered by the test variable, i.e., Simultaneous Capacity and Sensitivity.


Table 3

Comparison Between Tests and Assessments

                            Correlations with
                            Criterion             t Values            Explained
Variable Assessed           Assessm.    Test      Assessm.    Test    Variation
Simultaneous Capacity       0.55        0.42      6.13        4.64    25%
Inductive Intelligence      0.40        0.33      4.35        3.21    [...]
Verbal Intelligence         0.32        0.28      3.45        2.24    83%
Mechanical Comprehension    0.30        0.21      3.26        2.28    88%
Sensitivity                -0.21       -0.07      2.79        0.74    11%


According to these results, the method used for the assessment of certain traits of the pilot applicants seems to have given more valid results than were achieved by means of the standardized tests as such. It has, however, to be observed that the criterion here (the ability to pass certain courses for pilot aspirants in SAS) is not a measure of the trait itself. The assessment can take into account that sort of simultaneous, inductive, verbal, or mechanical intelligence which is of special importance for the pilot, something the test cannot do. It thus seems possible to improve the predictive value of the measurement by modifying the results of standardized tests in accordance with the findings and interpretations of examiners who are well acquainted with the kind of job for which the selection is made.


The Efficiency of the System 


The efficiency of the selection system is measured by the correlation between the co-pilot stanines and the criterion remaining-dismissed.² The bi-serial correlation amounts to 0.75 ± .07. The difference between the means of the two groups is 1.76 ± .20, which yields t = 8.80.

² The correlation between the captain stanines and the criterion amounts to 0.51 ± .08. This coefficient, however, is of less interest, since the criterion refers to the efficiency as co-pilot. The partial correlation between the captain variable and the criterion with the pilot variable held constant has also been calculated. It amounts to -.09. The correlation between the captain variable and the co-pilot criterion is thus wholly due to the pilot qualities required of a good captain.

In the recommendations to SAS the aptitudes as co-pilot and captain were summarized into four categories of suitability for employment. The categories were as follows:

1. Particularly suitable (neither stanine less than 5, total at least 14)
2. Suitable (neither stanine less than 5, total 10-13)
3. Doubtful (one stanine 4 against the other 5 or more, one stanine 3 against the other 7 or more)
4. Unsuitable (remaining combinations)

SAS was in no case obliged to follow the recommendations. In 1951 the psychologists were in fact disregarded fairly often, above all in cases where the company had received information from former employers which did not coincide with the opinion of the psychologists. The experience obtained from such cases resulted in the psychological assessment later being more relied upon. In Table 4 the number of dismissals within the various categories of suitability is shown.

In Table 5 the number of dismissals within the different pilot-stanine categories is shown. The extreme classes have been pooled together into 9-8 and 3-1, because of the small number of individuals in each class.
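
The four categories of suitability for employment can be written out as a small rule set. This is a sketch only: the function name is hypothetical, and it assumes that the two conditions listed under "Doubtful" are alternatives and that the two stanines are interchangeable in every rule.

```python
def suitability(co_pilot: int, captain: int) -> str:
    """Map a pair of aptitude stanines (1-9) to one of the four categories."""
    for stanine in (co_pilot, captain):
        if not 1 <= stanine <= 9:
            raise ValueError("stanines run from 1 to 9")

    low, high = sorted((co_pilot, captain))
    total = low + high

    if low >= 5:
        # Neither stanine below 5: the total separates the two upper categories.
        return "Particularly suitable" if total >= 14 else "Suitable"
    if (low == 4 and high >= 5) or (low == 3 and high >= 7):
        return "Doubtful"
    return "Unsuitable"


for pair in [(7, 7), (5, 8), (4, 6), (3, 7), (3, 6), (4, 4)]:
    print(pair, suitability(*pair))
```

With neither stanine below 5 the total is at least 10, so the first two rules cover every such pair and the remaining combinations fall to the last two.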


Table 4

Dismissal Rate in Various Categories of Suitability for Employment

                                                   Dismissals in
Category                 Employed    Dismissed     Percentage
Particularly Suitable    49          -             -
Suitable                 218         8             3.7%
Doubtful                 59          4             6.8%
Unsuitable               37          17            45.9%
Total                    363         29            8.0%
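
The percentages in Table 4 follow directly from the counts. A minimal sketch, in which the blank entry for the "Particularly suitable" category is read as zero dismissals (an assumption, though one that the column total of 29 supports):

```python
# (employed, dismissed) per category, as given in Table 4.
table_4 = {
    "Particularly suitable": (49, 0),
    "Suitable": (218, 8),
    "Doubtful": (59, 4),
    "Unsuitable": (37, 17),
}


def dismissal_rate(employed: int, dismissed: int) -> float:
    """Dismissals as a percentage of those employed in the category."""
    return 100.0 * dismissed / employed


total_employed = sum(employed for employed, _ in table_4.values())
total_dismissed = sum(dismissed for _, dismissed in table_4.values())

for category, (employed, dismissed) in table_4.items():
    print(f"{category}: {dismissal_rate(employed, dismissed):.1f}%")
print(f"Total: {dismissal_rate(total_employed, total_dismissed):.1f}%")
```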


Discussion 

Ordinary selection systems are based on the predictive capacity of a test battery. Each test is correlated with criteria obtained in earlier investigations, and the correlations serve as a basis for a multiple regression analysis. Thus a system of weights is determined for the calculation of the final stanine scores. The procedure is altogether statistical. The advantages of such a system consist above all in its capacity. It makes it possible to examine and assess the aptitude of a large number of applicants within a relatively short time.

In contradistinction to this system the selection work for SAS has the characteristics of a craftsman’s job. It is certainly more expensive than the pure test system, since qualified psychologists have to examine each applicant. It is true that standardized tests are used in the SAS system as well, but these tests serve as tools for an individual assessment, the result of which depends on the examiner’s capacity to make use of the tools at his disposal. Instead of a system of fixed coefficients its technique of dynamic interpretation works with what could be designated as flexible weights. This means, for instance, that the dynamic interpretation is able to avoid overestimation, when a man with high scores in the tests has proved himself unable to use his capacities in practical life situations.
There is one drawback to a system based on dynamic interpretations: it depends on the skill of the psychologists who do the work. The selection of examiners is therefore a problem of the same importance as the selection of tests in a mechanical prediction system. Not only the selection, but also the training of the examiners is a laborious task. Parallel examinations of the same examinees must be routine procedure to control the standards used, and individuals employed must be followed up continuously in order to maintain and improve the validity of the predictions.

Experiments often indicate that the interviewer (or the interview!) cannot increase the validity of predictions based on tests. The significance of this is doubtful. It may be true, however, that the interview is ineffective when the information used by the interviewer is confined to what is supplied by the interview itself. But this seems to be an almost foolish way to use an examiner, since it deprives him of the natural tools of the psychologist, e.g., the results of standardized tests and the techniques of individual examinations.

The selection system worked out by the author is meant to be a synthesis of a statistical and a clinical approach. The psychologist has been given the leading role in this synthesis, since he knows the limits and efficiency of the tools. There is no magic in the fact that psychologists, when given the leading role, can be more effective as predictors than batteries of tests. It is a question of experience and training, proficiency and sense for relevant facts, and last but not least ability and courage to make an intelligent use of the tools of psychology.


Table 5

Dismissal Rate in Different Pilot Stanine Classes

Pilot                                  Dismissals in
Stanine      Employed    Dismissed     Percentage
9-8          11          -             -
7            82          1             1.2%
6            126         3             2.4%
5            104         9             8.7%
4            25          5             20.0%
3-1          15          11            73.3%
Total        363         29            8.0%

Received July 29, 1958. 


References 


Holt, R. R. Clinical and statistical prediction: A 
reformulation and some new data. J. abnorm. 
soc. Psychol., 1958, 56, 1-12. 




Langewiesche, W. Are airline pilots any good? Air 
Facts, 1955, 18, 29-58. 

Trankell, A. Rekryteringen av piloter i Svenska 
flygvapnet. Tidskrift i Militär Hälsovård, 1956, 
1, 1-30. 

Trankell, A. Erfarenheter av en metod för uttagning av piloter till Scandinavian Airlines System. Meddelanden från Flyg- och Navalmedicinska Nämnden, 1956, No. 1.


Journal of Applied Psychology 
Vol. 43, No. 3, 1959 


A CODING SYSTEM FOR TOTAL PROFILE ANALYSIS 
OF THE STRONG VOCATIONAL INTEREST 
BLANK 


JOHN O. CRITES 


State University of Iowa 


Although Darley’s (1941) technique of pat- 
tern analysis of the Strong Vocational Interest 
Blank (SVIB) has been widely used in both 
clinical practice and research, the need for a 
coding system appropriate for total profile 
analysis of the SVIB has become increasingly 
apparent. As Kirk (1956, p. 309) has pointed 
out with reference to Darley’s method: “There 
is a need both in research and in counseling 
for total profile analysis not served by ad- 
herence to this system.” At present there is 
no method available which expresses both 
characteristics of the interest profile: the 
elevation and shape of the pattern (Cronbach 
& Gleser, 1953; DuMas, 1947). Interpreta- 
tions of interest profiles based upon standard 
scores, letter ratings, or Darley’s approach 
take into consideration the elevation of the 
interest pattern but not its shape. They ex- 
press the degree to which an individual's in- 
terests are similar to those of men engaged 
in occupations within an interest group but 
do not represent the configuration formed by 
the varying elevations across interest groups. 
That the shape of an interest pattern as well 
as its elevation may be Psychologically sig- 
nificant, however, and should be considered in 
the interpretation of SVIB profil 
suggested in a stud 
It would seem des 


Desiderata of a Coding System 
In addition to as isomorphic a representa- 
tion of the elevation and shape of an interest 
profile as possible, the coding system should 
ideally have other characteristics. Super- 
fluous meanings associated with the code 
symbols should be kept to a minimum; the 


definition of the profile must be as operational as possible. Codes should be communicable verbally to facilitate case conferences and other discussions of the characteristics of individuals with given types of profiles. The coding system should be exhaustive in nature: all possible combinations of elevation and shape should be codeable. It should be comprehensive enough to permit ready manipulation, such as in the classification and filing of profiles. And, the system should be adaptable to use in research, and subject to at least a nominal level of quantification for purposes of statistical analysis.
The proposed method of coding SVIB profiles which follows represents an attempt to meet these criteria. It has been modeled after the coding systems devised by Hathaway (1947) and Welsh (1948) for the MMPI and therefore bears considerable resemblance to them. In one rather basic respect, however, it differs. Whereas the assumption underlying the coding and interpretation of the MMPI is that ". . . the shape of the profile is of greater significance than the elevation of single scores" (Hathaway & Meehl, 1956, p. 137), the assumption made in the analysis of interest patterns on the SVIB is that elevation is more important than shape. The coding system outlined below reflects this distinction.

The Coding System


The basic structure of the coding system accounts for both the elevation and shape of the interest pattern. The elevation of an SVIB profile is designated by classifying interest groups by their appropriate code numbers according to type of interest pattern: primary, secondary, or reject. The nonoccupational scales (occupational level, OL; masculinity-femininity, MF; and interest maturity, IM) are coded in much the same manner as the interest groups but with slightly different score ranges than those equivalent to the letter ratings since the standard scores have different meanings on these scales (Strong, 1943). The shape or configuration of the profile is indicated by the order of primary, secondary, and reject patterns from left to right positions in the code. Those interest groups in which primary patterns occur are assigned to the first code position, secondary patterns to the second code position, and reject patterns to the third code position. Thus, the elevation and shape of the interest profile are represented in the code by type of pattern (primary, secondary, reject) and position (from left to right), respectively.
To determine primary, secondary, and reject patterns Darley and Hagenah's (1956) revision of the former's method of interest profile analysis is used. Instead of identifying primary, secondary, and tertiary patterns for all interest groups, including single-occupation groups, the newer procedure defines primary, secondary, and reject patterns in seven interest groups which include all single-occupation groups with the exception of the Musician scale. The interest groups and their respective code numbers are listed in Table 1. To follow conventional terminology the code numbers correspond to the interest groups on the SVIB profile sheet rather than to consecutive numerical order.

Unpatterned profiles, those with no primary, secondary, or reject patterns, have not been assigned a code position or number. Rather, they are indicated simply by the absence of code designations for the regular patterns. Provision for identifying unpatterned profiles in the system was made because of their relatively high incidence in younger age groups and in groups of physically mature but less well-adjusted clients. By coding these uncrystallized patterns, impetus may be given to their further study and explication. Their meaning at the present time is quite uncertain.

Procedure in Coding a Profile


To code an SVIB profile the steps are as follows:



1. Determine primary, secondary, and re- 
ject patterns according to Darley and Hag- 
enah’s (1956) procedure. 

2. Form the profile code. List the num- 
bers of those interest groups in which there 
are primary patterns; set these off with a 
single prime (‘). Then list the numbers cor- 
responding to those interest groups which 
have secondary patterns; follow those with 
a double prime ("). Finally, list those in- 
terest groups which contain reject patterns 
and place a dash (—) after the last numeral. 
List lower numbers first when forming the 
code. 

3. Code the nonoccupational scales in the following manner: first, assign the symbols X, Y, and Z to represent the score ranges 60 or above, 40 to 59, and 0 to 39, respectively; and, second, form the code from left to right in the sequence OL, MF, IM (if specialization level, SL, is coded, place it between MF and IM).


To illustrate the procedure consider the SVIB profile formed by the letter ratings in Table 1. First, the primary, secondary, and reject patterns are identified. In this instance, primary patterns occur in the biological science, physical science, and technical occupational interest groups; there are no secondary patterns; and reject patterns occur in the business detail, business contact, and verbal-linguistic areas. Next, using the appropriate code numbers for each interest group the profile code is formed. For the illustrative profile the code is 124'890-. Finally, the code symbols for the nonoccupational scales are added. The scores for OL, MF, and IM, in the example, were 62, 44, and 39, respectively. Thus, the full code was 124'890-XYZ. Examples of codes for some other possible profiles are: unpatterned = XYZ; single primary (business detail) = 8'XYZ; secondary only = 289"XYZ; reject only = 12590-XYZ.
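The three coding steps and the worked example can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, the patterns are taken as already identified by Darley and Hagenah's procedure (step 1 is not implemented), the prime, double prime, and dash are written as the plain characters ', ", and -, and the rule that code 0 is listed last is inferred from the printed codes (for example 890) rather than stated in the text.

```python
# Interest-group code numbers in profile-sheet order. Placing 0 last is an
# inference from the printed codes (e.g. "890"), not a rule stated in the text.
SHEET_ORDER = (1, 2, 4, 5, 8, 9, 0)


def scale_symbol(score):
    """Map a nonoccupational scale score to X (60 or above), Y (40-59), or Z (0-39)."""
    if score >= 60:
        return "X"
    if score >= 40:
        return "Y"
    return "Z"


def ordered_digits(groups):
    """Return the code numbers of the given interest groups in profile-sheet order."""
    unknown = set(groups) - set(SHEET_ORDER)
    if unknown:
        raise ValueError(f"unknown interest-group codes: {sorted(unknown)}")
    return "".join(str(g) for g in SHEET_ORDER if g in groups)


def svib_code(primary=(), secondary=(), reject=(), *, ol, mf, im, sl=None):
    """Build a profile code from already-identified patterns and scale scores.

    primary, secondary, reject: code numbers of the interest groups showing
    each type of pattern (step 1 of the procedure is assumed to be done).
    ol, mf, im: nonoccupational scale scores; sl is optional and, if given,
    is placed between MF and IM as the text specifies.
    """
    parts = []
    if primary:
        parts.append(ordered_digits(primary) + "'")    # single prime
    if secondary:
        parts.append(ordered_digits(secondary) + '"')  # double prime
    if reject:
        parts.append(ordered_digits(reject) + "-")     # dash
    scores = [ol, mf] + ([sl] if sl is not None else []) + [im]
    parts.append("".join(scale_symbol(s) for s in scores))
    return "".join(parts)


# The illustrative profile from the text: primaries in groups 1, 2, 4;
# rejects in groups 8, 9, 0; OL = 62, MF = 44, IM = 39.
example = svib_code(primary=[1, 2, 4], reject=[8, 9, 0], ol=62, mf=44, im=39)
```

An unpatterned profile is simply a call with no pattern arguments, which leaves only the three nonoccupational symbols.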


Discussion 


The proposed coding system approximates most of the characteristics which it should ideally have. It is operational in nature and objective to the extent that the judgments of

Table 1

Code Numbers for Interest Groups on SVIB and Illustrative Profile

Code  Interest Group       Occupational Scales                               Illustrative Profile

1     Biological Sciences  Artist, Psychologist, Architect, Physician,       C+, B-, B-, B+, A, A, [illegible]
                           Osteopath, Dentist, Veterinarian

2     Physical Sciences    Mathematician, Physicist, Engineer, Chemist       B[sign illegible], B[sign illegible], A, A

4     Technical            Farmer, Aviator, Carpenter, Printer, Math         A, A, B-, A, A, B-, A, B+, B+, B
                           Teacher, Ind. Arts Teacher, Ag. Teacher,
                           Policeman, For. Sv. Man, Prod. Manager
                           (Group III)

5     Social Service       Y.M.C.A. Phys. Dir., Personnel Dir., Pub.         B+, C, B, C, C+, C, [illegible]
                           Administrator, Y.M.C.A. Sec'y, Soc. Sci.
                           Teacher, City Sch. Supt., Minister

8     Business Detail      Sr. C.P.A., Acct., Off. Man., Pur. Agent,         B+, [remainder illegible]
                           Banker, Mortician, Pharmacist

9     Business Contact     Sales Mgr., Real Est. Sales, Life Ins. Sales,     C, C, C, C
                           Pres. Mfg. Concern (Group XI)

0     Verbal-Linguistic    Adv. Man, Lawyer, Auth.-Jr., C.P.A.               C, C, [remainder illegible]
                           (Group VII)


Code Symbols for Nonoccupational Scales on SVIB Profiles

X     Nonoccupational scale score range of 60 or above
Y     Nonoccupational scale score range of 40 to 59
Z     Nonoccupational scale score range of 0 to 39

Note. C letter ratings are to left of shaded area of SVIB profile sheet.


primary, secondary, and reject patterns are reliable (Darley & Hagenah, 1956). The coded profiles can be communicated verbally with a minimum of awkwardness and distortion, although it may be too cumbersome to express other than the major characteristics of a profile, such as abbreviating 124'890-XYZ to "one-twenty-four." All possible profiles can be coded and filed according to type of pattern. And, coded profiles can be used in research since they can be quantified on at least a nominal level of measurement, e.g., either an individual has or does not have a particular SVIB pattern.

A number of research possibilities using coded SVIB profiles are feasible. One line of investigation might be the identification of personality characteristics which are associated with various combinations of primary, secondary, and reject patterns. Study of appropriate and inappropriate interest patterns in relation to the self-concept should be facilitated through the use of coded profiles (Super, 1954). Hypotheses concerning personality differences between individuals who have the same primary but different supporting secondary or reject patterns should be testable when profiles are coded. Possible "conflict" patterns, such as 20's or 59's, which represent opposing and often contradictory occupational stereotypes and values, can be extensively studied. And, the meaning of unpatterned profiles may become clearer as their correlates are identified.

A concluding thought: before the coding system is used in research it would be desirable to have it fully evaluated and modified, if necessary, in order to promote standard procedures in experimentation and, ultimately, comparability of results.
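The nominal-level quantification mentioned in the discussion (an individual either has or does not have a particular pattern) can be illustrated by turning a written code back into 0/1 indicator variables. The sketch below is an assumption-laden illustration, not part of the proposed system: the function names and the indicator naming scheme are hypothetical, and the prime, double prime, and dash are again written as ', ", and -.

```python
import re

# Interest-group code numbers used by the coding system.
GROUPS = (1, 2, 4, 5, 8, 9, 0)

# Optional primary digits + ', optional secondary digits + ", optional reject
# digits + -, then three or four nonoccupational symbols (four if SL is coded).
CODE_RE = re.compile(r"^(?:(\d+)')?(?:(\d+)\")?(?:(\d+)-)?([XYZ]{3,4})$")


def parse_code(code):
    """Split a profile code into sets of group numbers and the scale symbols."""
    match = CODE_RE.match(code)
    if match is None:
        raise ValueError(f"not a valid profile code: {code!r}")
    primary, secondary, reject, scales = match.groups()

    def to_set(digits):
        # A missing section (None) means no patterns of that type.
        return {int(d) for d in (digits or "")}

    return {
        "primary": to_set(primary),
        "secondary": to_set(secondary),
        "reject": to_set(reject),
        "scales": scales,
    }


def indicators(code):
    """Return 0/1 indicators, one per pattern type and interest group."""
    parsed = parse_code(code)
    return {
        f"{kind}_{group}": int(group in parsed[kind])
        for kind in ("primary", "secondary", "reject")
        for group in GROUPS
    }


row = indicators("124'890-XYZ")
```

Each coded profile then becomes one row of 21 indicators, and an unpatterned profile is the row in which every indicator is 0.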




Summary 


A coding system for total profile analysis of the SVIB was proposed which would represent the elevation and shape of the interest pattern as well as have other characteristics desirable for definition, communication, filing, and research. The basic structure of the system was outlined, the steps in coding a profile were delineated, and an illustration of the procedure was given. Some possible areas of research using the coded SVIB profiles were briefly discussed.


Received August 4, 1958. 


References

Crites, J. O. Ability and adjustment as determinants of vocational interest patterning in late adolescence. Unpublished doctoral dissertation, Columbia Univer., 1957.

Cronbach, L. J., & Gleser, Goldine C. Assessing similarity between profiles. Psychol. Bull., 1953, 50, 456-473.

Darley, J. G. Clinical aspects and interpretation of the Strong Vocational Interest Blank. New York: Psychological Corp., 1941.

Darley, J. G., & Hagenah, Theda. Vocational interest measurement. Minneapolis: Univer. Minnesota Press, 1956.

DuMas, F. M. On the interpretation of personality profiles. J. clin. Psychol., 1947, 3, 57-64.

Hathaway, S. R. A coding system for MMPI profiles. J. consult. Psychol., 1947, 11, 334-337.

Hathaway, S. R., & Meehl, P. E. Psychiatric implications of code types. In G. S. Welsh & W. G. Dahlstrom (Eds.), Basic readings on the MMPI in psychology and medicine. Minneapolis: Univer. Minnesota Press, 1956. Pp. 136-144.

Kirk, Barbara A. Review of Vocational interest measurement. J. counsel. Psychol., 1956, 3, 309.

Strong, E. K., Jr. Vocational interests of men and women. Stanford: Stanford Univer. Press, 1943.

Super, D. E. The measurement of interests. J. counsel. Psychol., 1954, 1, 168-171.

Welsh, G. S. An extension of Hathaway's MMPI coding system. J. consult. Psychol., 1948, 12, 343-344.


Journal of Applied Psychology 
Vol. 43, No. 3, 1959 


ART WORK VERSUS PHOTOGRAPHY: 
AN EXPERIMENTAL STUDY 


CHARLES WINICK 


Graduate School of Business, Columbia University 


Various attempts to measure the compara- 
tive effectiveness of photography and art work 
methods of illustrating advertisements have 
generally found photography to be superior, 
usually because of its greater realism (Which 
Ad Pulled Best?: 1947, 1949, 1950, 1951). 
A few studies have found art work superior 
(Which Ad Pulled Best?: 1950, 1951, 1952, 
1957). Photography illustrations of adver- 
tisements appearing in a business weekly 
(Best Read Industrial Advertisements, 1953), 
an industrial magazine (Starch, 1954), and 
a trade paper (DeWolf, 1954), have been 
reported to be superior to comparable art 
work illustrations. 

The playback method, in which a respondent is asked what he recalls from a given advertisement in a specific issue of a magazine, has also been used to measure the comparative efficacy of different methods of illustration. One measure used in Playback is Proved Name Registration, which is a score representing the proportion of readers of a given issue of a magazine who can confirm having seen an advertisement by recalling one or more of its copy points. In food factorization on Proved Name Registration, there was a 22% penalty for sketch art compared with photography. Advertisements with four color photo realism averaged 23% in playback of beauty, whereas advertisements with sketch art averaged 15%.


Procedure 


In order to get empirical data on the comparative effect of photographs and art work as methods of illustrating consumer magazine advertising, it was decided to use the accordion method of paired comparisons, in which two paired groups of subjects would each be shown an accordion folder. Each folder consisted of a series of four advertisements, with the dimension of art work and photography varied in only one of the four advertisements and everything else held constant:

Accordion Folder 1:   X   Y   Z   A
Accordion Folder 2:   X   Y   Z   A'

Thus, in Accordions 1 and 2, the advertisements X, Y, and Z would be exactly identical in both accordions. Advertisements A and A' would have the identical text, but A would be illustrated by art work and A' by a photograph of exactly the same situation or scene. In order to maximize comparability, experienced commercial artists and photographers did both photographs and art work as required. The 4 test advertisements were taken from a 1955 issue of Life magazine, and the other advertisements also came from the same magazine.
Three were illustrated by photographs an 


> sb actually 2P° 
art work, in the advertisements which act weti 


‘ons 
Pe vtustratio 
advertisements which had an art work m for! 
For each advertisement, there were thus d onè 


of © 5 
coffee illustration (D and D’) was of a jertisemen 
si 


: i. dy- yas 
of advertisements under experimental stu! adults © 
i 


i in 
The subject population was divided into two matched groups of 481 each. The groups were matched on age and on socioeconomic status. Each respondent in the two groups saw one version of each advertisement. The advertisements in the accordion were presented in systematically varied arrangements (k!), in order to avoid any bias which might be created by following the same sequence of presentation for all Ss.

The Ss were asked to rank each of the four advertisements in the accordion presented to them first, second, third, and fourth, on the dimensions of which advertisements they liked the most and which advertisements they felt were easiest to believe. The sequence of these two questions was rotated in order to avoid any possible bias. After these two questions had been asked, the interviewer then took the accordion away and the Ss were asked to describe everything they recalled in each advertisement. The number of percepts recalled correctly from each illustration was counted and the advertisement which had the greatest number of percepts correctly recalled was given Rank 1, the advertisement with the second greatest number of percepts Rank 2, and so forth. This provided the third dimension of recall, on which the two groups of advertisements were ranked.

The number of ranks, from 1 to 4, which each advertisement had on each of the three dimensions of: (a) liked most (an inferred measure of impact), (b) most believable, and (c) recall, was tabulated. After totalling each advertisement's rank, a chi-square test was conducted in order to test the significance of the differences between the two versions of each paired advertisement on each of the three dimensions studied.

Due to the nonparametric nature of the data and the nature of the ranking process, it was possible to group and reclassify the data in order to get a more meaningful test of underlying patterns without interfering with the assumptions of the chi-square test. Ranks 1 and 2 on each dimension were grouped together into one larger rank, as were Ranks 3 and 4. The four ranks thus became two ranks, and were compared with each other. The chi-square test was used to determine the significance of the difference between the groups of combined ranks.
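The collapsing of four ranks into two and the resulting chi-square comparison can be sketched as follows. The rank counts are invented for illustration only (they merely sum to the reported group size of 481), the function names are hypothetical, and the uncorrected Pearson chi-square for a 2 x 2 table is an assumption, since the text does not say whether a continuity correction was applied.

```python
def collapse_ranks(counts):
    """Combine counts for Ranks 1-2 and for Ranks 3-4 into two categories."""
    first, second, third, fourth = counts
    return [first + second, third + fourth]


def chi_square_2x2(row_a, row_b):
    """Uncorrected Pearson chi-square for a 2 x 2 table given as two rows."""
    a, b = row_a
    c, d = row_b
    n = a + b + c + d
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    if denominator == 0:
        raise ValueError("a row or column total is zero")
    return n * (a * d - b * c) ** 2 / denominator


# Hypothetical numbers of Ss giving Ranks 1, 2, 3, 4 to one test advertisement.
art_work = collapse_ranks([100, 100, 150, 131])    # group shown version A
photograph = collapse_ranks([130, 130, 120, 101])  # group shown version A'

chi2 = chi_square_2x2(art_work, photograph)

# 3.841 is the .05 critical value of chi-square with 1 degree of freedom.
significant_at_05 = chi2 > 3.841
```

With the two combined categories there is a single degree of freedom, so one critical value serves for every advertisement and dimension.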


Results 


The chi-square and p values obtained are shown in Table 1.

By and large, an illustration for an advertisement which ranks high in one of the three dimensions studied appears to have some tendency to rank high on the other two dimensions, suggesting the operation of a kind of halo effect.

In only one of the four test advertisements 
(A and A’) did art work appear to be pre- 
ferred, and then only on the dimension of 
believability (p < .05). This was perhaps
due to the humorous and attention getting 
nature of the cartoon illustration showing 
running dogs. This advertisement’s feeling 
of movement would probably help it to stand 
out from the pages of the magazine as a 
reader rifled through them. It would prob- 
ably have, therefore, obtained more initial 
attention from an actual reader of the maga- 
zine than was possible in this experiment. 

The most clear-cut superiority of photog- 
raphy over art work was in the second adver- 
tisement; on the dimension of most liked, 
the photograph was preferred to art work 
(p < .001) and on believability and recall
the degree of preference was also significant
(p < .01). This result was in line with current
advertising practice, which has empha-
sized the effectiveness of photographs of food 
and drink. 

The third advertisement, with an illustra- 
tion of a stenographer smiling in an office, was 
more believable in the photography version 
(p < .02) than its art work version. It might 
be speculated that this result occurred because 
interiors presented by photographs permit 
relatively rapid recognition and permit the perceiver
to relate them to previous experience.

The fourth advertisement, with a relatively 
realistic scene of a man with a cup of coffee, 
also displayed a high degree of superiority for 
photography over art work, on the dimension 
of most liked (p < .001). On the dimensions


Table 1

Chi-Square and p Values for Ranked Paired Advertisements on the Dimensions of Most Liked, Believability, and Recall

                 Most Liked           Believability        Recall
                 Chi square    p      Chi square    p      Chi square    p
j A. Gasoline To -106 sa Reon a 7923 ‘Ol 
' OARDER 19.550 ; os 2 967 >.50 
B: Soft Drink 1.008 >.50 5.928 A me 30 
a Tooth Paste 28.100 ‘001 312 ; i 
. Coffee 3 




of believability and recall, the degree of supe- 
riority of photography was not statistically 
significant (p < .10).

Sex and socioeconomic status did not ap- 
pear to be significantly associated with a 
preference for any of the four pairs of adver- 
tisements shown or with the three variables 
studied. 


Discussion 


On the basis of these four pairs of adver- 
tisements, it would appear that photographic 
representations of human beings, edibles, and 
interiors, are easier to identify with and recall 
details of, and provide a clearer and more 
realistic visual demonstration of their subject 
matter’s content and meaning than does an 
art work representation. 

Art work would appear, on the basis of 
these results, to be particularly appropriate 
where an unusual kind of attention getting 
illustration is needed. Other areas of possible 
art work superiority would be the difficulty 
of photographing certain situations, or the 
difficulty of photographing something which 
has yet to happen or which has already hap- 
pened. It is of course fallacious to assume 
that there is any one kind of photographic or 
any one kind of art work illustration. There 
are many different ways of photographing an 
object or person, and some art work may 


even be more representational than some 
photographs. 


Summary and Conclusions 


Four paired advertisements, one using photography and the other using art work, were shown to a matched sample of 962 adults in the New York City area. The Ss ranked each advertisement on the dimensions of most liked, believability, and recall. No sex or socioeconomic differences emerged. Statistically significant differences were found for the photographic version of three advertisements respectively showing a man, a woman in an office, and a man drinking coffee. Art work was favored in one advertisement, which semihumorously showed a dog in motion.

Any decision to use either art work or photography for a communication depends on many factors, including the object to be reproduced, the medium of communication, the effect desired, and the associated text. The results of this study, based on advertisements from one consumer magazine, must be interpreted with caution.


Received August 18, 1958. 


References 


Best Read Industrial Advertisements ane 
Outpull Drawings. Industr. Marketing, 
(10), 174. 

DeWolf, J. Why these trade ads have z 
pact. Printers’ Ink. 1954, 246(9), 42-4 ape 

Starch, D., et al. Tested Copy. 1954, NO- 2190) 

Which Ad Pulled Best? Printers’ Ink, 1941 #949, 
40; 1949, 228(7), 36; 1949, 229(5), 97%. 33; 
220(8), 42; 1950, 232(3), 37; 1951, 23 ; 
1951, 237(2), 58. (12)! 

Which Ad Pulled Best? Printers’ Ink, 1950 238,08 
27; 1951, 237(5), 58; 1952, 239(5)» 37 
258(4), 45. 


visual 3” 


Journal of Applied Psychology
Vol. 43, No. 3, 1959


SELF-PERCEPTIONS OF FIRST-LEVEL SUPERVISORS
COMPARED WITH UPPER-MANAGEMENT PERSONNEL
AND WITH OPERATIVE LINE WORKERS


LYMAN W. PORTER 


University of California


The position of the first-level supervisor, or foreman, has long been an important area of personnel psychology. The question of just where the foreman fits into the organizational setup and how he relates to the other positions in the organization has been viewed in a number of different ways (Gardner & Whyte, 1945; Roethlisberger, 1945; Turner, 1954; Wray, 1949). For example, Roethlisberger, in discussing the foreman's situation when he is interacting with other types of employees, notes that: "Nowhere in the industrial structure more than at the foreman level is there so great a discrepancy between what a position ought to be and what a position is. This may account in part for the wide range of names which foremen have been called, shall we say 'informally,' and the equally great variety of definitions which have been applied to them in a more strictly formal and legal sense" (1945). Roethlisberger himself refers to the foreman as "master and victim of double talk" and "victim, not monarch, of all he surveys." Gardner and Whyte (1945) refer to the foreman as "the man in the middle," and Wray (1949) has called the foreman the "marginal man of industry."
All of these investigators point to the fact that the foreman occupies a unique position in the formal organization. On the one hand, he is simply a part of management and receives his orders and directions from management. When he interacts upward, he interacts with other management personnel. When he interacts downward, however, his position is different from that of any other in management, because the employees he directs are line employees, and, therefore, nonmanagement. Thus, he directs people who are not part of his own group or part of the group to which his superiors belong. For this reason his position is unique among management positions.

Another distinctive feature of the first-level supervisor's position is the fact that typically he was formerly a part of the operative group which he now must direct. Other members of management usually enter the organization as part of management and continue in that same capacity. They do not have to change their allegiance as they advance, and they continue to supervise from within the same group in which they started. This means that the foreman has additional personnel relations problems in his work not faced by most upper-management personnel.
The above considerations indicate that the first-level supervisor has an involved task in gaining the approval of those with whom he works. If he tries to follow directives from above in such a way as to give maximum satisfaction to upper management, he may decrease his popularity and effectiveness with the men he supervises. On the other hand, if he tries too much to carry out his duties in a way that will most please those under him, he may not receive the maximum approval from his management superiors. Any person who supervises and is supervised faces this problem to some extent, but it becomes most acute for those at the foreman level. The foreman's position, then, is one that is psychologically perhaps the most difficult in the entire organization. Because of the varied nature of the expectations that others hold for him, the foreman does not have an easy task in trying to maintain a clear-cut self-perception. Since a person's self-perception is probably strongly influenced by the role demands he perceives operating in his work situation, it may be instructive to compare the self-descriptions of first-level supervisors with those above them (upper management) and with those below them (line workers). The present study is concerned with these comparisons.


Method 


The instrument used in this study to obtain the self-perceptions was a 64-pair forced-choice adjective check list developed by Ghiselli and used in previous studies (Ghiselli, 1954; Porter, 1958; Porter & Ghiselli, 1957).

The self-description inventory was completed by 172 first-level supervisors, 291 upper-management personnel, and 320 operative line workers. For the purpose of this study, "first-level supervisors" are defined as those who directly supervise operative line workers, regardless of the specific title used in particular organizations. "Operative line workers" are all those who are on the bottom level of the organization and who have no supervisory duties, and "upper management" consists of all management personnel above the first-level supervisors. The upper management group thus consists of top management people (presidents, vice-presidents, and officers of similar rank) and middle management people (operating division and department heads and various staff personnel such as personnel managers and purchasing agents). All three samples of Ss were drawn from organizations heterogeneous as to geographical location and nature of enterprise. Although the inventory was administered in a variety of circumstances, none of the Ss was familiar with the particular uses that would be made of his answers to the check list. Therefore, it is probable that no systematic "set" other than that of accurate self-description was operating for all Ss within any of the three personnel categories formed for this study. However, any given S may have used a set other than a self set, and hence the composite descriptions obtained for each category may be those of persons having somewhat varying sets.


Table 1

Items Differentiating Upper-Level Management Personnel and First-Level Supervisors

First-Level Supervisors          Upper-Level Management Personnel

See themselves as:               See themselves as:
  planful                          resourceful
  deliberate                       sharp-witted
  calm                             sincere
  fair-minded                      thoughtful
  steady                           sociable
  responsible                      reliable
  civilized                        dignified
  self-controlled                  imaginative
  logical                          adaptable
  patient                          sympathetic
  honest                           generous

Do not see themselves as:        Do not see themselves as:
  moody                            affected
  stubborn                         cold
  conceited                        infantile
  stingy                           shallow
  touchy                           defensive
  dreamy                           dependent
  nervous                          intolerant
  careless                         foolish
  egotistical                      apathetic
  evasive                          despondent
  selfish                          weak
  self-centered                    rude
  disorderly                       rattle-brained
  fussy                            submissive
  opinionated                      pessimistic
  excitable                        sly
  impatient                        irresponsible


Results and Discussion

The responses of the 783 individuals participating in the study were analyzed for each of the 64 pairs of adjectives. Table 1 presents the 28 pairs that differentiated first-level supervisors and upper-level management personnel at the .05 level of confidence or better. Eleven of the pairs are composed of favorable adjectives, and the other 17 of unfavorable terms that "least describe" the individual. Table 2 presents the 14 items that differentiated the first-level supervisors from line workers at the .05 confidence level or better. Nine of these pairs are composed of favorable adjectives, and five of "least descriptive" unfavorable traits.

As has been noted in a previous paper, when results are presented in this manner "the differences are relative, and do not necessarily indicate that one adjective was favored by the majority of one group and the other adjective by a majority of the other group" (Porter, 1958). In many instances, a majority of each group favored the same adjective, but the majority was significantly larger in one of the groups. Therefore, one group relatively more often chose one of the adjectives and the other group relatively more often chose the other adjective. Also, when a person selected one word in a pair, he was not necessarily rejecting the other word in the pair; he was only indicating that the chosen word was more or less descriptive of him than the other word. Additionally, it should be noted that the specific list of traits obtained in the comparisons was in part a function of the specific words contained on the check list. However, 128 adjectives (64 pairs of words) constituting a wide range of traits were available for choice, and the results obtained should not be strongly biased due to a particular sampling of words contained in the inventory.
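The point that the differences are relative can be made concrete with a small worked example. The sketch below assumes a pooled two-proportion z test, which the article does not name as its method; the choice counts are invented for illustration, and only the group sizes (172 supervisors, 291 upper-management personnel) come from the text.

```python
import math


def two_proportion_z(chose_1, n_1, chose_2, n_2):
    """Pooled two-proportion z statistic for one forced-choice pair.

    chose_1 and chose_2 are the numbers in each group who picked the first
    adjective of the pair; n_1 and n_2 are the group sizes.
    """
    p_1 = chose_1 / n_1
    p_2 = chose_2 / n_2
    pooled = (chose_1 + chose_2) / (n_1 + n_2)
    standard_error = math.sqrt(pooled * (1 - pooled) * (1 / n_1 + 1 / n_2))
    return (p_1 - p_2) / standard_error


# Hypothetical counts: 120 of 172 supervisors and 170 of 291 upper-management
# personnel choose the same adjective of a pair.
supervisor_share = 120 / 172
management_share = 170 / 291
z = two_proportion_z(120, 172, 170, 291)

# 1.96 is the two-tailed .05 critical value of the standard normal distribution.
differentiates_at_05 = abs(z) > 1.96
```

In this invented case a majority of both groups picks the same adjective, yet the pair would still be listed as differentiating because the two majorities differ reliably in size.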
Inspection of Table 1 reveals that the first-level supervisors, if contrasted with upper management, tended to perceive themselves in terms that indicate a careful and controlled approach toward their job and toward other people in their work environment. Among the favorable adjectives, the supervisors relatively more often checked words like "planful," "deliberate," "calm," and "self-controlled," whereas upper-level management personnel for these same pairs relatively more often checked "resourceful," "sharp-witted," "sincere," and "imaginative." Among the unfavorable adjectives, traits such as "moody," "stubborn," "careless," "evasive," "disorderly," "fussy," and "opinionated," seem to characterize the type of person that a supervisor does not see as himself, when the comparison is with upper-level managers. Unfavorable traits relatively more often checked by the higher-management personnel for these same pairs include "affected," "cold," "foolish," "despondent," "rattle-brained," "submissive," and "pessimistic." The favorable and unfavorable traits taken together show that the typical supervisor sees himself as a more conservative person than does the typical upper-management person. The supervisors seldom tended more often to check an adjective that indicates independence or strong aggressiveness. The upper-management personnel, on the other hand, seemed relatively more frequently to check adjectives that gave a picture of greater enterprise, originality, and boldness.
6 results presented i 
e self-perceptions of t 


n Table 2 show how 
he first-line super- 


185 


visors compared with those of the people they 
direct, the operative line workers. It can be 
seen from Table 2 that the items relatively 
more characteristic of supervisors in this com- 
parison are ones that give somewhat the same 
picture of these individuals as was found in 
Table 1, even though their descriptions are 
now being contrasted with an entirely different 
group. Nine of the 14 differentiating items 
in Table 2 were also items found in the Table 
1 comparisons, and on six of these nine items 
the supervisors differed in the same way from 
operative workers as they did from upper 
management. In other words, on a majority 
of items that were constant to both compari- 
sons, there was not a trend from upper man- 
agement to supervisors to operative workers; 
supervisors, instead, seem a group set apart 
in the same way from both men above and 
below them in the organization. They more 
often saw themselves as deliberate,
fair-minded, steady, responsible, logical,
and not dreamy, instead of sharp-witted,
thoughtful, sociable, reliable, adaptable,
and not dependent, when compared either to
management or to line operatives. Only on
rude-self-centered, rattle-brained-disorderly,
and submissive-fussy

Table 2

Items Differentiating First-Level Supervisors
and Line Workers

First-Level Supervisors        Line Workers

See themselves as:             See themselves as:

energetic                      ambitious
practical                      industrious
deliberate                     sharp-witted
clear-thinking                 efficient
fair-minded                    thoughtful
steady                         sociable
modest                         pleasant
responsible                    reliable
logical                        adaptable

Do not see themselves as:      Do not see themselves as:

dreamy                         dependent
rude                           self-centered
rattle-brained                 disorderly
submissive                     fussy
cynical                        aggressive




did supervisors differ from line operatives in 
the same direction as upper-management per- 
sonnel had differed from them. Just as super- 
visors picture themselves in conservative and 
careful terms in comparison with upper- 
management personnel, they likewise tend to 
picture themselves in these same terms in 
comparison with operative personnel. There 
are other areas in which they also seem to 
differ from line workers. They do not ap- 
pear as concerned with being gregarious and 
friendly, and also seem less submissive and 
flexible. In short, foremen seem to be espe- 
cially conscious of a supervisory role. 

The study as a whole tends to show that 
the self-perceptions of supervisors reflect their 
unique position in the structure of organiza- 
tions. Their self-descriptions show certain 
differences from those of men they direct, 
but they also show somewhat the same differ- 
ences from those of men who direct them. 
Supervisors do not differ from subordinates 
in the same way that their superiors differ 
from them. Since their number one duty is 
direct supervision, their self-perceptions may 
be more acutely affected by the role demands 
of this type of activity than are the
self-perceptions of upper-management people who
also have supervisory duties but in addition
have other more general administrative
functions. If the role demands are largely
responsible for shaping the self-perceptions of
first-level supervisors, then the self-descriptions
provide data on how the supervisors interpret
the role demands. The general picture of
cautious individuals that seemed to emerge
from the findings may be indicative of the
psychological position as well as the strictly
formal position in which they see themselves
in their organizations. Being “marginal men”
or “men in the middle,” both formally and 


psychologically, they may reflect this situa- 
tion by seeing themselves as individuals who 
act with restraint in carrying out their super- 
visory functions. 


Lyman W. Porter 


Summary 


The self-perceptions of 172 first-level
supervisors were compared to those of 291
upper-management individuals and to 320
operative line workers. Ss were employed by a
wide variety of industrial and business
organizations, with the self-descriptions being
obtained by administration of a 64-item
forced-choice adjective check list. The items
that differentiated between supervisors and
upper-management personnel tend to show that
foremen view themselves as more conservative
and cautious individuals in comparison with
those above them in management. When the
supervisors’ self-descriptions are compared
with the self-descriptions of operative line
workers, similar results occur; supervisors
apparently view themselves as more careful and
restrained individuals than do operative
workers. There thus does not appear to be a
consistent trend in self-perceptions from
upper-level managers to supervisors to line
workers; instead, supervisors’ self-perceptions
seem to show that these men are a group
different in somewhat the same way from both
those above them and those below them in the
organizational hierarchy.


Received August 25, 1958. 


References

Gardner, B. B., & Whyte, W. F. The man in the
middle: Positions and problems of the foreman.
Appl. Anthrop., 1945, 4, 1-28.

Ghiselli, E. E. The forced-choice technique in
self-description. Personnel Psychol., 1954, 7,
201-208.

Porter, L. W. Differential self-perceptions of
management personnel and line workers. J. appl.
Psychol., 1958, 42, 103-108.

Porter, L. W., & Ghiselli, E. E. The self
perceptions of top and middle management
personnel. Personnel Psychol., 1957, 10, 397-406.

Roethlisberger, F. J. The foreman: Master and
victim of double talk. Harv. bus. Rev., 1945,
23, 283-298.

Turner, A. N. Foremen—key to worker morale.
Harv. bus. Rev., 1954, 32, 76-86.

Wray, D. E. Marginal men of industry: The
foremen. Amer. J. Sociol., 1949, 54, 298-301.


Journal of Applied Psychology
Vol. 43, 1959


SUBLIMINAL PERCEPTION: SOME NEGATIVE FINDINGS 


ALLEN D. CALVIN 


Hollins College 


anp KAREN S. DOLLENMAYER 


Northwestern University 


Recent reports in the popular press
concerning the claim that stimuli below the level
of conscious awareness can influence behavior
have had remarkable repercussions with
both Congress and the FCC investigating the
problem. The claims by a commercial
organization that they had succeeded in
increasing the sales of Coca-Cola 18% and that
of popcorn over 50% by flashing subliminal
messages at 1/3000 of a second during a
motion picture program was the principal cause
of concern. The major television networks
have responded by banning the use of such
techniques.
McConnell et al. (McConnell, Cutler, &
McNeil, 1958) have just completed an
excellent comprehensive review of the
experimental evidence relating to this problem.
They conclude by saying, “One fact emerges
from all of the above. Anyone who wishes to
utilize subliminal stimulation for commercial
or other purposes can be likened to a stranger
entering into a misty, confused countryside
where there are but few landmarks. Before
this technique is used in the market place, if
it is to be used at all, a tremendous amount
of research should be done, and by
competent experimenters” (p. 237).


Method 


Subjects. Sixty female undergraduate students
served as Ss.

Apparatus. A Gerbrands’ tachistoscope. The lamps
used for illumination in our model of the Gerbrands
tachistoscope are 4 watt, daylight, fluorescent lamps.
These lamps are operated on an ignition voltage of
approximately 250 volts d.c. and a filament voltage
of [illegible] volts a.c. The normal operating current of
each lamp is approximately 130 mils. There are
four lamps in the tachistoscope, 2 for the exposure
field and 2 for the pre-exposure field.

Procedure. The Ss were seen in individual
experimental sessions where they were given the following
instructions:


We are interested in investigating the possibility 
of ESP (telepathy). I want you to look into this 
machine [Gerbrands’ tachistoscope]. Do not re- 
move your eyes from the machine until I tell you 
to do so. In the machine you see a card with two
circles on it. The left one is marked L and the 
right one is marked R. We want to find out if 
you can guess which of the two circles is correct 
on a particular trial. The correct circle will be so 
designated on the back of the card, but you, of 
course, will not be able to see the designation. 
After I say “ready” you will hear a click. I want 
you to tell me after the click whether you think 
the left or the right circle is correct for that trial. 
If you think that the left one is correct, say “left,”
and if you think that the right one is correct, say 
“right.” 

Half of the Ss were then told: 

I will tell you if your choice is correct or in- 
correct. There will be ten trials. Between trials 
there will be a brief pause while a new card is
inserted. Are there any questions?


With each click the words “choose left” or “choose 
right” made up of block letters } of an inch high 
were flashed in the center of the screen. Whether 
“choose left” or “choose right” was flashed was pre- 
determined by a Gellerman (1933) order. The cor- 
rect circle on any trial was, of course, the one flashed 
in the message. At the conclusion of the
experimental session, each S was asked for a verbal report.

A three by two factorial design was used. The Ss 
were assigned randomly to the conditions such that 
an equal number of Ss were in each condition. There
were three exposure speeds, .01 second, .02 second, 
and .03 second, and half the Ss at each speed were 
told when they were correct (hereafter referred to as 
the TWC Ss) while the other half (NTWC) were 
given no knowledge of the correctness of their choices. 
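
The assignment procedure described above can be sketched in a few lines. This is only an illustration of balanced random assignment to a three by two design; the function name, the subject numbering, and the seed are assumptions introduced for the sketch and are not taken from the study.

```python
import random
from collections import Counter
from itertools import product


def assign_balanced(subject_ids, speeds, feedback_levels, seed=None):
    """Randomly assign subjects so that every cell receives the same number."""
    cells = list(product(speeds, feedback_levels))
    if len(subject_ids) % len(cells) != 0:
        raise ValueError("subjects cannot be divided equally among the cells")
    per_cell = len(subject_ids) // len(cells)
    # One slot per subject: each cell repeated per_cell times, then shuffled.
    slots = [cell for cell in cells for _ in range(per_cell)]
    random.Random(seed).shuffle(slots)
    return dict(zip(subject_ids, slots))


# 60 Ss, three exposure speeds (sec.), told when correct (TWC) or not (NTWC).
assignment = assign_balanced(
    subject_ids=list(range(1, 61)),   # hypothetical subject numbers
    speeds=(0.01, 0.02, 0.03),
    feedback_levels=("TWC", "NTWC"),
    seed=1959,                        # hypothetical seed, for repeatability only
)
cell_counts = Counter(assignment.values())
```

With 60 Ss and six cells, each combination of exposure speed and knowledge of results receives 10 Ss.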


Results and Discussion

The mean numbers of correct choices for
each group are presented in Table 1.
An analysis of variance was conducted and
the over-all F was not significant. Other
analyses indicated that none of the groups
differed significantly from chance which, of
course, was five correct choices.

Although none of the groups exceeded 




Table 1

Mean Number of Correct Choices

           Exposure speed (sec.)
Group      .01      .02      .03

TWC        4.9      5.0      5.4
NTWC       4.1      5.0      5.8


chance expectations, four Ss made nine or 
more correct choices. Making nine correct 
choices out of 10 is significant at about the 
1% point if the S had been selected before- 
hand. All four Ss indicated during their ver- 
bal report that they had been able to read the 
words. One of the four high-scoring Ss was 
in the TWC .02 group, one in the TWC .03
group, one in the NTWC .02 group, and one 
in the NTWC .03 group. 
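
The "about the 1% point" figure can be checked with the binomial distribution. The sketch below assumes that each of the 10 guesses is independent with a chance probability of one half; the function name is introduced here for illustration only.

```python
from math import comb


def upper_tail(n, k, p=0.5):
    """Probability of k or more successes in n independent trials."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


# Chance of a single preselected S scoring exactly nine, or nine or more, of 10.
p_exactly_nine = comb(10, 9) * 0.5**10   # 10/1024
p_nine_or_more = upper_tail(10, 9)       # 11/1024

# Expected number of such scores among 60 Ss guessing at chance.
expected_among_60 = 60 * p_nine_or_more
```

The qualification "if the S had been selected beforehand" matters: with 60 Ss, a score of nine or more is expected by chance for well under one S, so four such scores point to something other than guessing, which is consistent with the verbal reports.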

In addition to the four mentioned above, 
six other Ss reported that they could read the 
words although usually not until the later 
trials. Their scores were 5, 6, 7, 7, 8, and 8. 
The rest of the Ss did not report seeing
anything except the circles.

Our results thus indicate that under the
conditions of the present experiment there was
no evidence for subliminal perception.
This obviously does not mean that subliminal
perception cannot occur under other
conditions, and since the commercial organization
refuses to release the details of its
experimental procedure (McConnell, 1958), no
direct comparison between their findings and
ours is possible. Nevertheless, it seems
reasonable to assume that the striking findings
claimed by the commercial organization at an
exposure speed of 1/3000 of a second are due
to some artifact and are not a genuine
instance of subliminal perception.


Summary 


Sixty female undergraduates served as Ss
in a study designed to investigate subliminal
perception. Speed of stimulus presentation
and knowledge of results were varied in a
three by two factorial design. No evidence of
subliminal perception was obtained.
Implications of these findings were discussed.


Received September 3, 1958, 


References 

Gellerman, L. W. Chance orders of alternating
stimuli in visual discrimination experiments.
J. genet. Psychol., 1933, 42, 207-208.

McConnell, J. V. Subliminal stimulation:
[illegible] developments. Paper read
[venue illegible], D. C., September, 1958.

McConnell, J. V., Cutler, R. L., & McNeil,
E. B. Subliminal stimulation: An overview.
Amer. Psychologist, 1958, 13, 229-242.




EVALUATION OF TRAINING IN CREATIVE
PROBLEM SOLVING ¹


ARNOLD MEADOW AND SIDNEY J. PARNES


University of Buffalo 


A method widely employed in industry,
government, and education is the
creative problem-solving method outlined by
Osborn (1957). The present study was
designed to provide a systematic experimental
test of the effects of a 30-hour training course
in creative problem solving which utilizes
Osborn’s and related methods.
A review of the literature in the area of
creative thinking indicates four groups of
relevant studies. The first series comprises studies
attempting to differentiate creative from
noncreative individuals by means of tests of
cognitive abilities, by personality measures,
and by biographical data analysis (Creative
Education Foundation, 1958).
A second series attempts to determine the
effects of various factors postulated to inhibit
productive thinking. Among these are studies
evaluating the effects of pathological
personality, experimentally induced anxiety,
and experimentally induced set (Rapaport,
Gill, & Schafer, 1945-46; Youtz, 1955).
Studies comparing individual and group
problem solving procedures (Lindzey, 1954;
Taylor, Berry, & Block, 1957; Taylor &
McNemar, 1955) and studies evaluating a
lecture and a workshop in creative thinking
(Gerry, DeVeau, & Chorness, 1957; True,
1957), comprise the third and fourth bodies
of literature.


Hy, potheses 


In the course at the University of Buffalo
the principles in Osborn’s textbook (1957)
are described, and students are given practice
in their application (Parnes, 1958). The
brainstorming principle is emphasized throughout
the course. The basic thesis of this principle
is that creativity is encouraged by the
temporal segregation of hypothesis formation and
the judicial evaluation of the adequacy of
hypotheses.

In the attempt to evaluate the effects of
the course in creative problem solving, three
hypotheses were proposed for experimental
testing: the method employed in the course
produces a significant increment (a) in
quantity of ideas, (b) in quality of ideas,
and (c) in three personality variables—
need achievement, dominance, and
self-control. The variables embodied in these
hypotheses were selected on the basis of a search
of the literature for measures reported to
discriminate creative from noncreative
individuals (Creative Education Foundation, 1958).

¹ This study was financed by a grant from the
Creative Education Foundation. The IBM Corporation
provided the programing and computations
required by the statistical analysis.


Method 


Experimental Design 

The three hypotheses were tested by administering 
a battery of psychological tests comprised of 11 
measures to students taking the Creative Problem 
Solving courses in the School of Business Adminis- 
tration and to control groups of Ss taking other 
courses in the same school. The basic design of the 
experiment is depicted in Table 1. 

The experimental group consisted of a total of 54 
students in three Creative Problem Solving courses. 
Two were evening sections; the other was a day sec- 
tion. Since total pre-post testing time required four 
hours, it was not practicable to administer all tests 
to one control group. Two control groups were
accordingly employed. Those measures of the battery
which were considered to be tests of ability were
administered to Control Group A. Control Group B
received those tests which were considered to be
personality measures. The one exception to this
procedure was the Thematic Apperception Test (TAT)
Originality ability measure which was included in 
the Control Group B battery because the total num- 
ber of ability tests was too great to be administered 
during one testing period. 

Each experimental S was matched with an S from 
each of the two control groups on the basis of age, 
sex, and Wechsler Adult Intelligence Scale (WAIS)
Vocabulary score (Wechsler, 1955). In order to
increase the accuracy of matching, the initial number
of control Ss tested was 200. Completion of the
matching yielded a total of 54 Ss for the
experimental group and 54 Ss for each of the two control


Table 1

Design of Experiment

                                                              Experimental  Control    Control
                                                              Group         Group A    Group B
Pre-post Test Measures                                        (N = 54)      (N = 54)   (N = 54)

 1. AC Test of Creative Ability—Other Uses (quantity)              X            X
 2. Plot Titles Low (quantity)                                     X            X
 3. Guilford Unusual Uses (quality)                                X            X
 4. Apparatus Test (quality)                                       X            X
 5. AC Test of Creative Ability—Other Uses (quality)               X            X
 6. Plot Titles High (quality)                                     X            X
 7. Thematic Apperception Test—Originality (quality)               X                       X
 8. Thematic Apperception Test—Need Achievement                    X                       X
 9. California Psychological Inventory—Dominance Scale             X                       X
10. California Psychological Inventory—Self Control Scale          X                       X
11. Wechsler Adult Intelligence Scale—Vocabulary (a)               X            X          X

(a) Pretest only—for matching of experimental and control groups.


groups. The experimental and control groups were 
closely matched on the selected variables. Ages for 
the experimental group ranged from 17 to 51 years, 
for Control Group A, 17 to 50 years, and for Control 
Group B, 18 to 42 years. For the experimental and 
control A groups the average of the differences in 
age for the 54 matched pairs was 3.6 years; the
average of the differences in weighted WAIS
Vocabulary score was .60. For the experimental and control
B groups the average of the differences in age of the
54 matched pairs was 3.8 years, and the average of
differences in WAIS Vocabulary score was .68.

Of the final experimental group sample, 42 were 
male; 12 were female. The final Control Group A 
and B samples each consisted of 48 male and 6
female Ss. Of the 54 Experimental vs. Control Group
A matchings, 38 were of the same sex. The
corresponding number of same sex matchings for
Experimental and Control Group B was 40.

Tests were administered to all Ss as groups in their
regular classes at the beginning and end of the
semester. Three class sections were used for the
experimental Ss; ten sections were needed to attain
the necessary number of control Ss.


Experimental Instructions 


Each instructor introduced the experiment to his
class by describing it as a university research project
which would “not have anything to do with your
grades.” The test administrator was then presented
to the class.


Instructions given at pretest session at beginning 
of semester. I think you will find interesting what 
you are asked to do. Sometimes the nature of 
the task may seem strange or silly. Nevertheless, 
please cooperate to the fullest extent inasmuch as 
everything you are asked to do is highly signifi- 
cant. 


Instructions read at posttest session at the end
of semester. All of you are subjects in an
experiment designed to measure changes which may have
occurred in your thinking as a result of all your
course work at the University this semester.

During this period you will be given a
post-test, consisting of a series of tests similar to the
ones given the first time.

Your instructor, Mr. ____, is interested in
seeing how well each one of you does. On the other
hand, as explained before, the results of the tests
will not go on your record, or have anything to
do with your grades. It is a serious study and
will provide some interesting scientific data, and
we would appreciate your sincere cooperation.

In the tests you will now take, you may use
any answers which you may have used before
and/or any new answers. The important point is
to get as high a score as possible on the present
test.
jn see- 


Scoring 


All measures were scored by two independent
raters. Protocols were coded so that no rater was
aware of whether he was rating the protocol of a
control or an experimental subject.

Pearson correlation coefficients between the scores
of these raters were computed for all measures which
required qualitative ratings. Computations were
based on a randomly selected sample of 55.
Correlations ranged from .691 on the TAT Need
Achievement to .993 on Guilford’s Unusual Uses.

Guilford measures. The Guilford measures were
scored in accordance with standard scoring
instructions provided by the author of the tests.²

² The authors are indebted to J. P. Guilford and
P. R. Merrifield for their assistance in providing
unpublished tests and scoring instructions, and to
Robert F. Berner for statistical advice.


AC Test of Creative Ability (AC). Only one item
of the AC Test was employed because of time
limitation (listing all possible uses for a wire
coat hanger). The scoring procedure for this test
was modified to yield a quantity and quality score
instead of a quantity and uniqueness score. Each
response was scored as indicating either good or bad
quality. The quality score was defined as comprising
two dimensions: (a) uniqueness—degree to which
the response departed from the hanger’s conventional
use; (b) value—the degree to which the response
was judged to have social, economic, aesthetic, or
other usefulness.

Each rater was instructed to rate each response on
a numerical scale for uniqueness and a 1 2 3 scale for
value. The response was finally scored as indicating
good quality if assigned a combined uniqueness and
value score of at least 5. Final quality score used
was the total number of “good quality” responses.
Any response which duplicated (in essential meaning)
responses already given was eliminated from
the scoring.
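
The scoring rule just described can be sketched as follows. This is an illustration only: the tuple layout, the function name, and the use of a `meaning_key` label to stand in for the rater's judgment of duplicated meaning are assumptions made for the sketch, as is treating the quantity score as the count of non-duplicated responses. The example ratings are invented and do not come from the study.

```python
def ac_scores(responses, threshold=5):
    """Return (quantity, quality) for one S's list of responses.

    Each response is (meaning_key, uniqueness_rating, value_rating).
    meaning_key is a hypothetical label a rater would assign so that
    responses with the same essential meaning share one key.
    """
    seen = set()
    quantity = 0
    quality = 0
    for meaning_key, uniqueness, value in responses:
        if meaning_key in seen:
            continue  # duplicates are eliminated from the scoring
        seen.add(meaning_key)
        quantity += 1
        # "Good quality" requires a combined rating of at least the threshold.
        if uniqueness + value >= threshold:
            quality += 1
    return quantity, quality


# Invented ratings for illustration (value is on the 1 2 3 scale).
example = [
    ("hang clothes", 1, 3),
    ("open a locked car door", 3, 2),
    ("open a locked car door", 3, 2),   # duplicate in essential meaning
    ("wire sculpture", 3, 3),
    ("back scratcher", 2, 2),
]
example_scores = ac_scores(example)
```

In this example four distinct responses remain after the duplicate is dropped, and two of them reach a combined rating of 5 or more.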
Need achievement. This modification of the TAT
test was scored according to directions published by
McClelland, Atkinson, Clark, and Lowell (1953, pp.
107-158). The Originality measure was derived from
story protocols obtained from the same four TAT
cards utilized for deriving the Need Achievement
measure. Previous studies employing the Originality
measure were based on a global appraisal by
the rater (Barron, 1955). In the present investigation
an attempt was made to introduce greater objectivity
in scoring by adopting a detailed rating method. A
four-point rating scale was utilized to define the
Originality dimension on each of an S’s four stories: (a)
[illegible] or bare story—one point; (b) story with
some elaboration of characters and/or plot—two
points; (c) elaborate story—three points; (d) story
indicating unusual amount of imaginative
elaboration—four points. An S’s total originality score was
the sum of the points for all four stories.

California Psychological Inventory. The CPI
Dominance and Self Control Scales were scored
according to standard instructions provided by Gough ³
(1957).

Sequence of Tests 


In designing the experiment cognizance was taken
of any effect the sequence of the tests might have on
results. Test sequence was identical for the
experimental group and Control Group A. The comparison
of the experimental group with Control Group
B, however, introduces an uncontrolled test sequence
variable. On the one hand, the experimental group
took the series of six ability tests prior to the
administration of the three personality measures and
the TAT Originality measure. On the other hand,
Control Group B was administered the personality
measures and the TAT Originality measure without
prior administration of the series of ability tests.

The primary experimental interest was in testing
the effects of the creative problem solving course on
abilities. The decision was therefore made to place
six of the seven ability tests before the personality
tests, thus leaving the comparison of the ability tests
of the experimental group with Control Group A
uncontaminated by the test sequence effect. A priori
considerations suggested, moreover, that the ability
tests were less likely to influence personality
measures than the converse arrangement.

³ We wish to thank Harrison Gough for [illegible]
the two scales. We also wish to express acknowledgment
to the Consulting Psychologists Press, Inc., Palo Alto,
California, for permission to use the scales.


Results 


In order to control for possible differences
in initial levels of performance, an analysis of
covariance was employed for the evaluation of
differences between experimental and control
groups on all measures. Inspection of the
data indicated that the regression was
sufficiently linear to meet the assumptions of the
covariance model.

The calculation procedure employed is that
described by Edwards (1951, pp. 341-348)
for a two-variable analysis of covariance
design.

Table 2 presents the comparison between
the adjusted mean variances of experimental
and control groups for the two measures of
quantity of ideas. Inspection of the F ratios
indicates both measures are significant beyond
the 1% level.

A similar comparison is depicted in Table 3
for the five measures of quality of ideas. The
results indicate that the AC Other Uses
(quality), and the Guilford Apparatus and Unusual
Uses scores are significant beyond the 1%
level. The Plot Titles High score just fails to
reach the 5% level of significance. (Obtained
F is 4.01; 4.02 is required for the 5% level.)
The TAT Originality measure does not yield
a significant difference.

The comparison between experimental and
control groups for the three personality
measures is presented in Table 4. The results
indicate that the experimental as compared with
the control group achieves a significant
increase in Dominance. This comparison is
significant at the 5% level. The results for the
Need Achievement and Self Control variables
indicate no significant differences.



Table 2

Analysis of Covariance Between Pre-Post Differences of Matched Experimental and Control Groups
Controlled for Initial Score Level—Quantity Creativity Measures

                                                  Sum of Squares         Mean
Test            Source of Variation               Errors of Est.   df    Square        F       P

AC Other Uses   Between groups plus error           1946.012       53
Quantity        Residual within groups (error)      1057.891       52      20.344
                Adjusted Means                        888.121       1     888.121    43.655   <.01

Guilford Plot   Between groups plus error           4382.5275      53
Titles Low      Residual within groups (error)      3231.9973      52      62.154
                Adjusted Means                       1150.5302      1    1150.5302   18.511   <.01
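
The F ratios in Table 2 can be reproduced from the sums of squares. The sketch below is only an arithmetic check on the printed figures, not the Edwards (1951) calculation procedure itself; the function and variable names are introduced for illustration. It assumes, as the table layout indicates, that F is the adjusted-means mean square divided by the residual (error) mean square.

```python
def ancova_f(adjusted_ss, error_ss, error_df, adjusted_df=1):
    """F ratio: adjusted-means mean square over residual (error) mean square."""
    error_ms = error_ss / error_df
    adjusted_ms = adjusted_ss / adjusted_df
    return adjusted_ms / error_ms


# (adjusted-means SS, residual SS, residual df) as printed in Table 2.
table2 = {
    "AC Other Uses (quantity)": (888.121, 1057.891, 52),
    "Guilford Plot Titles Low": (1150.5302, 3231.9973, 52),
}
f_ratios = {name: ancova_f(*row) for name, row in table2.items()}

# The two component sums of squares should add to the first row of each block.
ac_total_ss = 888.121 + 1057.891            # between groups plus error
plot_total_ss = 1150.5302 + 3231.9973
```

Both computed ratios agree with the printed values of 43.655 and 18.511 to three decimals.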


Discussion 


The comparison between the Experimental
and Control Group A indicated significant
differences on both quantitative and qualitative
measures of ability. On the two measures of
idea quantity the experimental group attained
a greater increase than the control group.
This result suggests the conclusion that the
creative problem solving students were
utilizing the course methods, even though the tests
gave no instructions to do so.

Three of the quality measures (the AC
Other Uses—Quality, and the Guilford
Apparatus and Unusual Uses tests) yielded highly
significant differences. In evaluating the
results indicated by the AC Other Uses and
Guilford Unusual Uses scores, the specific
nature of the training employed in the course
must be considered. The students did receive

Table 3

Analysis of Covariance Between Pre-Post Differences of Matched Experimental and Control Groups
Controlled for Initial Score Level—Quality Creativity Measures

                                                  Sum of Squares         Mean
Test            Source of Variation               Errors of Est.   df    Square        F       P

AC Other Uses   Between groups plus error            819.5780      53
Quality         Residual within groups (error)       352.0691      52       6.771
                Adjusted Means                        467.5089      1     467.5089   69.046   <.01

Guilford        Between groups plus error           2603.3446      53
Apparatus       Residual within groups (error)      1466.6488      52      28.205
                Adjusted Means                       1136.6958      1    1136.6958   40.301   <.01

Guilford        Between groups plus error           1432.6361      53
Unusual Uses    Residual within groups (error)       795.3507      52      15.295
                Adjusted Means                        637.2854      1     637.2854   41.606   <.01

Guilford Plot   Between groups plus error            279.4798      53
Titles High     Residual within groups (error)       259.4284      52       4.989
                Adjusted Means                         20.0514      1      20.0514    4.019   >.05

TAT             Between groups plus error            114.5780      53
Originality     Residual within groups (error)       111.2877      52       2.140
                Adjusted Means                          3.2903      1       3.2903    1.538   >.05


Table 4

Analysis of Covariance Between Pre-Post Differences of Matched Experimental and Control Groups
Controlled for Initial Score Level—Personality Measures

                                                  Sum of Squares         Mean
Test            Source of Variation               Errors of Est.   df    Square        F       P

TAT Need        Between groups plus error            408.5023      53
Achievement     Residual within groups (error)       401.8881      52       7.729
                Adjusted Means                          6.6142      1       6.6142     .856   >.05

CPI             Between groups plus error            550.0154      53
Dominance       Residual within groups (error)       500.1160      52       9.618
                Adjusted Means                         49.8994      1      49.8994    5.188   <.05

CPI Self        Between groups plus error           1160.8034      53
Control         Residual within groups (error)      1148.0450      52      22.078
                Adjusted Means                         12.7584      1      12.7584     .578   >.05


training on the type of problem included on
the tests. However, since the instructors
carefully avoided practice on any objects even
remotely similar to the type of objects which
appeared on the tests, the results do indicate
a generalizing of this training. Results of the
Apparatus Test probably represent a greater
degree of learning generalization inasmuch as
exercises designed to afford students practice
in thinking of improvements for apparatus
were deliberately excluded from training.
Of the three personality measures, the CPI 
Dominance scale was the one measure which 
yielded a significant difference. This result 
indicated an increase in Dominance of the 
Experimental as compared with Control Group 
B (P< .05). This scale was devised by 
Gough “to assess factors of leadership ability, 
dominance, persistence, and social initiative. 
. . . High scorers tend to be seen as:
Aggressive, confident, persistent, and planful; as
being persuasive and verbally fluent; as
self-reliant and independent; and as having
leadership potential and initiative. Low scorers
tend to be seen as: Retiring, inhibited, com- 
monplace, indifferent, silent and unassuming; 
as being slow in thought and action; as avoid- 
ing of situations of tension and decision; and 
as lacking in self-confidence” (Gough, 1957, 
p. 12).

It is interesting that Dominance was the
one variable out of the three personality
variables which yielded a positive result. The
personality type it represents is the very type
which the methods of the course were
explicitly designed to encourage.


Summary 


The experiment was designed to evaluate
the effects of a creative problem-solving course
on creative abilities and selected personality
variables. Three hypotheses were tested: the
method employed in the course would produce
a significant increment (a) in quantity of
ideas, (b) in quality of ideas, and (c) in
the three personality variables—need
achievement, dominance, and self-control.

A battery of 10 test measures was
administered to matched experimental and control
groups at the beginning and end of a creative
problem solving course. The following
results were obtained: (a) The experimental as
compared with the control group attained
significant increments on the two measures of
quantity of ideas; (b) the experimental as
compared with the control group attained
significant increments on three out of five
measures of quality of ideas; (c) the experimental
as compared with the control group showed
a significant increment on the California
Psychological Inventory Dominance scale.

Results are interpreted to indicate that the
creative problem-solving course produces a
significant increment on certain ability
measures associated with practical creativity and
on the personality variable dominance.


Received September 3, 1958. 


References 


Barron, F. The disposition towards originality. J. 
abnorm. soc. Psychol., 1955, 51, 478-485. 

Creative Education Foundation. Compendium of 
research on creative imagination. Buffalo, N. Y.: 
Author, 1958. 

Edwards, A. L. Experimental design in psychologi- 
cal research. New York: Rinehart, 1951. 

Gerry, R., DeVeau, L., & Chorness, M. A review of 
some recent research in the field of creativity and 
the examination of an experimental creativity 
workshop. ‘Training Analysis and Development 
Div., Lackland AFB, Texas, 1957. 

Gough, H. C. Manual for the California Psycho- 
logical Inventory. Palo Alto, Calif.: Consulting 
Psychologists Press, 1957. 

Lindzey, G. (Ed.) Handbook of social psychology. Cambridge, Mass.: Addison-Wesley, 1954.

McClelland, D. C., Atkinson, J. W., Clark, R. A., & Lowell, E. L. The achievement motive. New York: Appleton-Century-Crofts, 1953.


Osborn, A. F. Applied imagination. New York: 
Scribner’s, 1957. 

Parnes, S. J. Description of the University of Buffalo Creative Problem Solving Course. Creative Education Office, Univer. of Buffalo, 1958. (Mimeo.)

Rapaport, D., Gill, M., & Schafer, R. Diagnostic 
psychological testing. Chicago: Chicago Yearbook 
Publishers, 1945-1946. 2 vols. 

Taylor, D. W., Berry, P. C., & Block, C. H. Does group participation when using brainstorming facilitate or inhibit creative thinking. Dep. of Industrial Administration and Dep. of Psychol., Yale Univer., 1957. (Tech. Rep. No. 1, Contract Nonr 609(20) NR 150-166.)

Taylor, D. W., & McNemar, Olga W. Problem solving and thinking. Annu. Rev. Psychol., 1955, 6, 455-482.

True, G. H. Creativity as a function of idea fluency, practicability, and specific training. Dissertation Abstr., 1957, 17, 401-402.

Wechsler, D. Manual for the Wechsler Adult Intelligence Scale. New York: Psychological Corp., 1955.

Youtz, R. P. Psychological background of principles and procedures in Alex F. Osborn’s textbook entitled Applied Imagination. Buffalo: Creative Educ. Found., 1955. (Mimeo.)


Journal of Applied Psychology
Vol. 43, 1959


INCREASING PROBABILITY OF TARGET DETECTION
WITH A MIRROR-IMAGE DISPLAY¹


C. H. BAKER AND G. E. BOYES

Defence Research Medical Laboratories, Toronto, Canada


In an earlier investigation (Baker, 1958) an analysis of the locations of targets detected in a radar-like task supported the hypothesis that Ss tend to scan back and forth along the revolving radial sweep-line when searching for targets. Such a scanning technique inevitably results in visual coverage of the extreme ends of the line (corresponding to minimum and maximum range on a PPI) which is half that devoted to mid-portions of the line, and, indeed, the analysis indicated that twice as many targets were detected behind the mid-portions of the revolving sweep-line as were detected in the regions behind the extremes. By redesigning the display in such a manner as to encourage a different type of search it was shown that visual attention could be biased towards one extreme of the sweep-line so as to increase the number of peripheral detections.

The present study was also concerned with increasing the probability of detection of targets appearing near locations representing maximum range, but with this difference: here we were concerned with maximizing the probability of detection of such targets by designing a radar-like display on which maximum range was represented by the center of the sweep-line.²


Apparatus


The basic apparatus simulated the type of display known as a B-scan, i.e., a [...]-inch square display of ground glass with a vertical sweep-line which moved from left to right across the front of the display six times per minute, each sweep taking 10 seconds. Single targets (bright spots of light one mm. in diameter) could be “painted on” the display at any one of 49 locations by the sweep-line and left on for any desired period of time (see Fig. 1). It will be noted from Fig. 1 that, as for a conventional B-scan, range and azimuth are represented by the vertical and horizontal dimensions, respectively, with maximum range represented by the top of the display and minimum by the bottom.

The apparatus could be turned on its side to present the display shown in Fig. 2. Range was now represented by the horizontal dimension, with minimum range represented by the left of the display, and maximum by the right. Such an arrangement encourages a lateral scanning motion. On the basis of conclusions drawn in tracking studies (Fitts & Simon, 1952) one would anticipate superior scanning behavior in the lateral to that in the conventional vertical direction.

A unique characteristic of the apparatus was that it could be “opened up” like a book so that the right half of the display was a lateral reversal of the left half (see Fig. 3).


FIG. 1. Conventional display. [Figure labels: range on the vertical axis, minimum at bottom and maximum at top; azimuth on the horizontal axis; sweep line; target.]


1 This study constitutes Defence Research Medical Laboratories Report No. 107-65, CC No. D77-94-[...], HR No. 163.

2 For evidence that target detectability is not improved by inverting the range dimension on a PPI so that display center represents maximum range, see Hickson and Scott (1958).

3 Targets were visually matched in brightness but it was evident in some of the trials that matches were not perfect.
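The coverage argument made in the introduction (a gaze that scans back and forth along a line gives the two ends half the coverage of the mid-portions) can be illustrated with a small counting sketch. The discrete "cells" and the fixed step pattern are assumptions made only for illustration; they are not a model taken from the study.

```python
from collections import Counter


def sweep_visits(n_cells: int, n_cycles: int = 1) -> Counter:
    """Count fixations per cell for a gaze stepping back and forth along a line."""
    if n_cells < 2:
        raise ValueError("need at least two cells to scan back and forth")
    # One full cycle: 0, 1, ..., n-1, then n-2, ..., 1.
    # The two end cells are fixated once per cycle, every other cell twice.
    cycle = list(range(n_cells)) + list(range(n_cells - 2, 0, -1))
    visits = Counter()
    for _ in range(n_cycles):
        visits.update(cycle)
    return visits


# Hypothetical example: a sweep-line divided into 7 cells, scanned for 10 cycles.
visits = sweep_visits(n_cells=7, n_cycles=10)
for cell in range(7):
    print(f"cell {cell}: {visits[cell]} fixations")
```

Under these assumptions the end cells (minimum and maximum range on a conventional display) receive half as many fixations as the interior cells, which is the asymmetry the mirror-image display is meant to exploit.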


FIG. 2. Horizontal display. [Figure labels: range on the horizontal axis, minimum at left and maximum at right; azimuth.]


Every target was now “painted” twice and maximum range was represented mid-way across the display. This arrangement has been termed the “mirror-image” display. Minimum range⁴ could be represented by any vertical line on the display. If minimum were represented by the lateral boundaries AA shown in Fig. 3, the left half of the display would be a duplicate of that in Fig. 2, while the right half would constitute a mirror-image. Under these conditions the display would be twice the area of that shown in Figs. 1 and 2. On the other hand, the mirror-image display could be equated in area with that in Figs. 1 and 2 by reducing the horizontal display dimension to one half its conventional value. Minimum range would now be represented by the dotted lines BB. Under this arrangement a target displayed at, say, 1/2 maximum range, would be physically located 1/2 of the distance from the dotted lines to maximum, and so would not be at the same physical location on the basic display as in the case when minimum AA was employed.


FIG. 3. Horizontal mirror-image display. [Figure labels: range running from minimum at each side to maximum at center; azimuth; boundary lines AA and dotted lines BB.]


4 “Minimum” and “maximum” are not to be taken literally. Targets appearing at these represented ranges were from [...] to [...] inch from display boundaries. “Half” range means half the physical distance between the “minimum” and “maximum” boundaries.


The mirror-image display could be placed on end, as shown in Fig. 4. Figures 1 to 4 represent the four experimental conditions which were compared.
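The relation between represented range and physical location on the mirror-image display can be sketched as follows. The coordinate convention (distances measured from the left minimum-range line, in units of the conventional display width) and the function name are assumptions introduced here for illustration only.

```python
def mirror_image_positions(range_fraction: float, half_width: float):
    """Return (left_x, right_x) of a target painted twice on a mirror-image display.

    range_fraction: 0.0 = minimum range, 1.0 = maximum range (display center).
    half_width: distance from a minimum-range line to the display center.
    Positions are measured from the left minimum-range line (assumed convention).
    """
    if not 0.0 <= range_fraction <= 1.0:
        raise ValueError("range_fraction must lie between 0 and 1")
    offset = range_fraction * half_width
    return offset, 2.0 * half_width - offset


W = 1.0  # conventional display width, arbitrary units (assumption)

# Minimum at AA: each half has the full conventional width (double-area display).
aa_left, aa_right = mirror_image_positions(0.5, half_width=W)

# Minimum at BB: each half has half the conventional width (equal-area display).
bb_left, bb_right = mirror_image_positions(0.5, half_width=W / 2)

# On the basic display the BB line lies W/2 inside the AA boundary, so the same
# half-range target sits at a different physical place in the two arrangements.
bb_left_on_basic_display = W / 2 + bb_left
print(aa_left, bb_left_on_basic_display)
```

In both arrangements a maximum-range target falls at the display center, where the two painted images coincide.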


Experiment 1 (Double Area Mirror-Image)

Procedure


Subjects sat alone in a semi-darkened booth at the display which was tilted 40 deg. back from the horizontal. Viewing distance was about 16 in. The task was to press a button whenever a target was detected. They were told that targets might be “painted on” anywhere on the display by the sweep-line, and sample targets were shown. Targets, which persisted for one second, appeared at only 13 of the 49 locations available, three at each of minimum, half, and maximum range,⁴ plus one randomly chosen from the remaining locations in each quarter. The same 13 locations were used in all four conditions though, of course, when the sweep-line was vertical they represented different ranges and azimuths than when it was horizontal. Each target in any trial was exposed eight times, and 8 X 13, or 104 targets were presented in a different random order for each S and condition. Intervals between targets randomly varied between 12 and 38 sec., the mean intertarget interval being 20 sec. A trial lasted 35 min. In this experiment minimum range was represented by AA (Figs. 3 and 4). Thus, the mirror-image displays were double the area of the other two.

Subjects were six laboratory personnel. Each S was exposed to each of the four conditions in a different order.


FIG. 4. Vertical mirror-image display. [Figure labels: range, minimum at top and bottom and maximum at center; azimuth.]


Table 1

Showing the Number of Targets Missed Out of 144 Presented, at Each of the Three Represented Ranges for Each of Four Display Conditions

Range          Conventional   Horizontal   Horizontal      Vertical
Represented    Display        Display      Mirror-Image    Mirror-Image
               (Fig. 1)       (Fig. 2)     (Fig. 3)        (Fig. 4)

Minimum        106            84           106             112
Half            39            34            55              59
Maximum         91            67            48              83

Total          236            185          209             254

Note. Mirror-image displays were double the area of the other two.


Results 


In Table 1 is shown the number of targets which escaped detection under each display condition at the nine locations representing minimum, half, and maximum range. From Table 1 it is apparent that in three of the four conditions more targets were detected in locations representing half range than in those representing minimum or maximum range. The exception was the horizontal mirror-image condition in which progressively fewer targets escaped detection as represented range increased.

The 48 misses in the horizontal mirror-image condition at maximum range were almost completely due to one target which was missed 44 times, the other two at this range being missed four and zero times. This same target is responsible for an enlarged number of misses in the horizontal condition, the total of 67 being composed of 15, 9, and 43. In location this target was at bottom center in the former condition and bottom right in the latter. Under the other two conditions this target was differently located and was not missed a disproportionate number of times, indicating a marked positional factor.


The percentage of times each of the 13 
targets was missed was subjected to an arc sin 
transformation and an analysis of variance 
was done. The analysis indicated significant 
differences between displays. However, dif- 
ferences between target locations and subject 
interactions implied some subject inconsist- 
ency possibly due to the inadequacy of bright- 
ness matches noted above. 

In summary it can be stated that this ex- 
periment has demonstrated that the horizontal 
mirror-image display results in an increased 
probability of detection of targets at locations 
representing maximum range. 
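The counts in Table 1 can be turned into miss rates with a few lines of Python. This is only a reader's check on the reported figures; the dictionary layout and variable names are assumptions, and the factorisation of 144 presentations as 3 targets x 8 exposures x 6 subjects is inferred from the Procedure section.

```python
# Targets presented per represented range and condition:
# 3 targets x 8 exposures x 6 subjects (inferred from the Procedure).
PRESENTED_PER_RANGE = 3 * 8 * 6

# Misses as reported in Table 1.
misses = {
    "Conventional (Fig. 1)": {"minimum": 106, "half": 39, "maximum": 91},
    "Horizontal (Fig. 2)": {"minimum": 84, "half": 34, "maximum": 67},
    "Horizontal mirror-image (Fig. 3)": {"minimum": 106, "half": 55, "maximum": 48},
    "Vertical mirror-image (Fig. 4)": {"minimum": 112, "half": 59, "maximum": 83},
}

totals = {name: sum(by_range.values()) for name, by_range in misses.items()}
miss_rate = {
    name: {rng: m / PRESENTED_PER_RANGE for rng, m in by_range.items()}
    for name, by_range in misses.items()
}

# Condition with the fewest misses at represented maximum range.
best_at_maximum = min(misses, key=lambda name: misses[name]["maximum"])

for name in misses:
    rates = ", ".join(f"{rng} {100 * r:.1f}%" for rng, r in miss_rate[name].items())
    print(f"{name}: total {totals[name]}; {rates}")
print("Fewest misses at maximum range:", best_at_maximum)
```

The per-target breakdown quoted in the text (44, 4, and 0 misses for the three maximum-range targets on the horizontal mirror-image display) sums to the tabled 48.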


Experiment 2 (Mirror-Image Display Equated for Area)


In Experiment 2 the mirror-image displays (Figs. 3 and 4) were equated for area with the remaining two (Figs. 1 and 2) by masking the two end quarters of the displays so that minimum range was represented by BB, i.e., in these displays range was represented by half the display dimension employed in Experiment 1. Only 12 targets were employed, three at each of minimum, 1/3, 2/3, and maximum range. Each target in any trial was exposed 10 times, the 10 X 12, or 120 targets being randomly presented in a different order for each S and condition. Targets persisted for 0.6 second. Subjects were eight females from outside the laboratory who were paid.


Table 2

Analysis of Variance of Transformed Data

Source                  df     SS          MS

Subjects (S)             7     18.4198     2.6314
Targets (T)             11     57.3637     5.2149
Conditions (C)           3      4.7609     1.5870
CT                      33     28.3389     0.8588*
ST                      77     17.0507     0.2214
SC                      21      6.8451     0.3260
SCT                    231     34.1891     0.1480
Total                         166.9682
Theoretical Residual                       0.1000

* Significant at .01 level.


Table 3

Showing the Number of Targets Missed Out of 240 Presented at Each of the Four Represented Ranges, for Each of Four Display Conditions. All Displays Were Equal in Area

Range               Conventional   Horizontal   Horizontal      Vertical
Represented         Display        Display      Mirror-Image    Mirror-Image
                    (Fig. 1)       (Fig. 2)     (Fig. 3)        (Fig. 4)

Minimum Range        50             52           82             114
1/3 Range             4              7           17              45
2/3 Range            23             22            2              30
Maximum Range        95             87           31              64

Total Missed        172            168          132             253
Percentage Missed    17.9           17.5         13.7            26.3


Results 


The percentages of target detections were 
transformed to radians by the arc sin trans- 
formation and an analysis of variance of the 
transformed values is shown in Table 2. From 
Table 2 it is apparent that the interaction of 
conditions and targets is significant, reflecting 
the fact that the locations at which targets 
were missed depended on the display. 
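The arc sin transformation mentioned above can be sketched as below. The form 2·arcsin(√p) is an assumption: the text does not state the exact formula, but with that form the sampling variance is approximately 1/n, which for the 10 exposures per target gives 0.1, the value printed as "Theoretical Residual" in Table 2. The mean-square ratios at the end are shown only as arithmetic on the tabled values; which error term the authors used for their significance test is not stated.

```python
import math


def arcsine_transform(p: float) -> float:
    """Angular transform of a proportion, in radians (assumed form: 2*asin(sqrt(p)))."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be a proportion between 0 and 1")
    return 2.0 * math.asin(math.sqrt(p))


def theoretical_variance(n_trials: int) -> float:
    """Approximate sampling variance of 2*asin(sqrt(p)) based on n_trials observations."""
    return 1.0 / n_trials


# Each target was exposed 10 times per subject and condition in Experiment 2.
residual = theoretical_variance(10)

# A target detected on 8 of its 10 exposures (hypothetical example).
phi = arcsine_transform(8 / 10)

# Mean squares from Table 2.
ms_ct, ms_sct = 0.8588, 0.1480
ratio_vs_sct = ms_ct / ms_sct
ratio_vs_theoretical = ms_ct / residual

print(f"transformed value: {phi:.4f} rad")
print(f"theoretical residual: {residual:.4f}")
print(f"CT / SCT: {ratio_vs_sct:.2f}; CT / theoretical residual: {ratio_vs_theoretical:.2f}")
```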

In Table 3 are shown the number of targets which escaped detection under each condition at each of the four locations representing minimum, 1/3, 2/3, and maximum range. From Table 3 it is again apparent that under all conditions the greatest attention was devoted to locations representing medium range. With respect to targets at locations representing maximum range, most were missed under the conventional condition (95), and fewest were missed under the horizontal mirror-image condition (31). Again, the target missed most frequently (21 out of 31) was at the bottom center. This same target, at bottom right, was responsible for 50 of the 87 misses in the horizontal condition, and was not missed a disproportionate number of times on the other two displays. It is apparent that the horizontal mirror-image display was superior to all others at 2/3 range also. In summary it is clear that the horizontal mirror-image display was superior on two counts, (a) on the total number of targets detected, and (b) on the greater number of targets detected in regions approaching maximum range.
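The totals and percentages in Table 3 can be re-derived from the cell counts. The sketch below is a reader's check only; the data layout and names are assumptions, and the 960 presentations per condition follow from 4 ranges x 240 targets.

```python
RANGES = ("minimum", "1/3", "2/3", "maximum")
PRESENTED_PER_CONDITION = 4 * 240  # four represented ranges, 240 targets each

# Misses per range as reported in Table 3, in the order given by RANGES.
missed = {
    "Conventional (Fig. 1)": (50, 4, 23, 95),
    "Horizontal (Fig. 2)": (52, 7, 22, 87),
    "Horizontal mirror-image (Fig. 3)": (82, 17, 2, 31),
    "Vertical mirror-image (Fig. 4)": (114, 45, 30, 64),
}
reported_percent = {
    "Conventional (Fig. 1)": 17.9,
    "Horizontal (Fig. 2)": 17.5,
    "Horizontal mirror-image (Fig. 3)": 13.7,
    "Vertical mirror-image (Fig. 4)": 26.3,
}

total_missed = {name: sum(counts) for name, counts in missed.items()}
percent_missed = {
    name: 100.0 * total / PRESENTED_PER_CONDITION for name, total in total_missed.items()
}

for name in missed:
    print(f"{name}: {total_missed[name]} missed, {percent_missed[name]:.2f}% "
          f"(reported {reported_percent[name]})")

# Condition with the fewest misses overall and at represented maximum range.
fewest_overall = min(total_missed, key=total_missed.get)
fewest_at_maximum = min(missed, key=lambda name: missed[name][RANGES.index("maximum")])
print(fewest_overall, "|", fewest_at_maximum)
```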


Discussion and Conclusion 


The study has demonstrated that displays can be designed to capitalize on the fact that some portions of displays are given more visual coverage than others. By designing a display in such a manner that brief events of greatest importance occur in the center of the area being searched, the probability of such events being detected is greater than if they occur in relatively peripheral regions. This principle appears to hold particularly in situations where lateral eye movements are involved. Vertical eye movements, where the distance scanned is sufficient to require head movements too, were not found to result in improvement in the probability of detection of centrally located events.


Received September 4, 1958. 


References 


Baker, C. H. Attention to visual displays during a vigilance task. I. Biasing attention. Brit. J. Psychol., 1958, 49, 279-288.

Fitts, P. M., & Simon, C. W. The arrangement of instruments, the distance between instruments, and the position of instrument pointers as determinants of performance in an eye-hand coordination task. USAF WADC tech. Rep., 1952, No. 5832.

Hickson, R. H., & Scott, D. M. Detectability [...] of cathode ray tube screens: Comparison of PPI, inverted PPI, and B-scan with noise and noise-free conditions. Defence Res. Board of Canada, DRML Rep., 1958, No. 163-15.


A FACTORIAL STUDY OF DEXTERITY TESTS¹


G. LEE BOURASSA

Allis Chalmers Manufacturing Company

AND ROBERT M. GUION

Bowling Green State University


[...] of the transistor and other industrial [...] involving very small items brings a need for a better understanding of very fine work. Factor analytic research has suggested that hand dexterity and finger dexterity may be separate abilities, distinguishable in terms of the fineness of the work performed (Dvorak, 1947; Fleishman, 1953; Fleishman & Hemple, 1954a; French, 1951; Hemple & Fleishman, 1955).

In developing a test to be used for the selection of transistor assemblers, the junior author made several observations pertinent to the present study: (a) the muscle movements involved in performance of this job were brief and short, the total span being measurable in very small fractions of inches; (b) even with parts large enough to hold in the fingers, all manipulations had to be made with tweezers to avoid contamination of material from finger oils; (c) tests of visual skills were significant predictors of performance; and (d) there was a significant correlation between the tweezer dexterity test developed and the depth perception test of the Bausch and Lomb Orthorater.

The present study is built around two hypotheses stemming from these observations: (a) there is a factor of psychomotor skill that can be called tweezer dexterity, the ability to make rapid and controlled manipulations with tweezers, that is different from previously identified factors of manual or finger dexterity; and (b) the relationship between visual factors and psychomotor factors is oblique rather than orthogonal.

Factor analytic literature offers very little information about either of these hypotheses.


1 This is a report of research done by Bourassa under the supervision of Guion in partial fulfillment of the requirements of the degree of Master of Arts at Bowling Green State University. The original thesis from which this article is taken is deposited in the University library.


Dexterity tests requiring tweezers have been 
included in psychomotor batteries for factor 
analysis, but not in large enough numbers to 
identify them as defining a distinct factor. 
Moreover, visual tests have not been included 
in psychomotor batteries. Factor analyses of 
visual skills are not common, but the analysis 
of Orthorater scores by Zachert (1951) does 
suggest that depth perception is distinct from 
simple visual acuity. 

Consideration of the literature in this do- 
main suggests one methodological weakness or 
flaw in previous research which should be 
avoided. It appears that previous investiga- 
tors have used the same order of testing for 
all Ss in a given study. Although Fleishman 
and Hemple (1954) have pointed out the ef- 
fect of practice on factor structure of a task, 
and although taking one psychomotor test 
may be considered practice for another, only 
a study by Fleishman (1953) reports any at- 
tempt to vary the administration order of test 
variables, and this was merely a reversal of order.


The Test Battery 


Tests were selected or constructed on the assumption of five factors: (a) manual dexterity, previously identified, the ability to make skillful, controlled arm and hand manipulations at a rapid rate; (b) finger dexterity, previously identified, the ability to make skillful, controlled manipulations with the fingers at a rapid rate; (c) tweezer dexterity, hypothesized, the ability to make skillful, controlled manipulations with tweezers at a rapid rate; (d) tentatively identified, visual acuity, the ability to perceive fine visual stimuli; and (e) tentatively identified, depth perception, the ability to perceive differences in distances of stimuli.

The following list identifies the tests used in terms of the factors each was assumed to identify. Reference tests used in previous studies will be identified by an (R) after the name of the test; tests identified by a (C) after the name of the test were constructed specifically for this study.²


Manual Dexterity 


1. Minnesota Rate of Manipulation—Turn- 
ing (R). The score is the number of blocks 
turned in two 35-second trials.* 

2. Minnesota Rate of Manipulation—Plac- 
ing (R). The score is the number of blocks 
placed in two 40-second trials. 

3. Dowel Manipulation (C). This test 
consists of a 6” X 18” board with 32 holes 
(four rows of eight) ;% inch in diameter. 
When a 13” length of 4” dowel is inserted into 
a hole, half of it protrudes. Dowels are in 
each hole, and the S removes a dowel with 
one hand and returns it to the hole, reversed, 
with the other hand. The task differs from 
the first test only in the size of the apparatus. 
Since it involves smaller pieces than the Min- 
nesota test, it was assumed that the loading 
for manual dexterity would be lower, although 
still significant since arm movement is in- 
volved. A finger dexterity loading was also 
anticipated. The score is the number of 
dowels inverted in two 25-second trials. 


Finger Dexterity 


4. Purdue Pegboard, Nonpreferred Hand (R). The score is the number of pegs placed in two 30-second trials.

5. Purdue Pegboard, Both Hands (R). The score is the number of pins placed in two 30-second trials.

6. O’Connor Finger Dexterity (R). The score is the number of pins placed in two 2-minute trials.

7. Placing, Finger (C). The test board contains 32 small washers (four rows of eight), inside diameters about 5”, into which the S places small pellets of shot (calibre 118 B-B’s). From a tray at the top of the board, the S takes a pellet and places it in the center of the washer. The score is the number of pellets placed in two 30-second trials.


2 Complete descriptions and photographs of the tests developed for this study are included in the original thesis.

3 Testing times were decided upon on the basis of pilot runs with varying numbers of Ss. It was desired to use the smallest time possible consistent with the need for reasonable reliability, in order to avoid over-fatigue of subjects.


Tweezer Dexterity


8. O’Connor Tweezer Dexterity Test. The score is the number of pins placed, by tweezers, in two 90-second trials.

9. Pin Moving (C). The test board contains two round trays, placed six inches apart, containing common pins with the heads removed. With tweezers the S moves pins one at a time from a full tray to an empty tray. The score is the number of pins moved in two 60-second trials.

10. Bowling Green Tweezer Dexterity Test. This test was developed specifically for selection of transistor assemblers; it consists of placing small plastic discs (diameter [...]) into holes in a brass plate with tweezers. There are eight rows of 12 holes in the plate; a small bowl at the top of the test board contains approximately 200 discs. The score is the number of discs placed in two 60-second trials.

11. Placing, Tweezer (C). This test utilizes the same apparatus as Test 7, except that the pellets are to be handled with tweezers rather than fingers. The score is the number of pellets placed in two 30-second trials.


Depth Perception


12. Orthorater Depth Perception (R). The score is the number of correct successive responses on two trials.

13. Depth Perception (C). Eight rows of four white discs (one inch in diameter) are mounted on dowels projecting approximately two inches from a black background. In each row, one disc extends further (i.e., is closer to the S) than the others. This distance difference is [...]” in the first row, and decreases by decrements of [...]” in the following rows until there is a difference of only [...]” in the eighth row. The S is to identify the displaced discs (primarily from binocular disparity, since light and shadow cues are reduced to a minimum) from a distance of eight [...]. The score is the number of correct successive responses on two trials.


Visual Acuity


14. Orthorater Near Acuity, Left Eye (R). The score is the number of correct successive responses on two trials.

15. Orthorater Near Acuity, Right Eye (R). The score is the number of correct successive responses on two trials.


Testing Procedure


Subjects were 100 women volunteers, enrolled in [...] psychology classes. Ages ranged from 18 to 23. Three Ss with monocular vision were rejected but were replaced. All testing was done by the senior author.

Two rooms were used. They were windowless, [...]' x 11', painted black; light was well controlled. Dexterity tasks were given in one room, visual tasks in the other. Tests requiring the S to be seated were given at an ordinary office desk; those requiring a standing position were given at a table.

Using a table of random numbers, 10 orders of presentation were developed for the psychomotor tests and 10 for the visual tasks. For each room a deck of cards was prepared, listing each sequence 10 times, and shuffled. The sequence for each S was given on the card on top when she entered the test room. Half of the Ss started with vision tests; half with dexterity. If S started with the visual tests, retesting was done after all motor tests were complete; if the vision tests were last, all four tests were given once, with retesting following in the same order.

Pearson product moment correlations were computed for the matrix, using the sum of the two administrations of each test as the raw score. Reliability estimates were computed by correlating the two independent administrations and then correcting by applying the Spearman-Brown prophecy formula.

The method of analysis was Thurstone’s centroid method, as described [...]. [...] sufficient factor extraction, in accordance [...] the side of extracting too [...] than too few. These [...] Humphrey’s Rule, and [...] extracted and [...] approximate simple structure and positive manifold. The data (perhaps unaware of the writers’ intention to deal with oblique factors) [...] orthogonal rotation, which was done graphically. [...] rotations were needed to approximate criterion.
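The scoring and reliability steps described above (sum of two administrations as the raw score, correlation between the two administrations, Spearman-Brown correction) can be sketched as follows. The scores are made-up illustrative numbers, not data from the study, and the function names are assumptions.

```python
import math


def pearson_r(x, y):
    """Product-moment correlation between two equal-length lists of scores."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two equal-length lists with at least two scores")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def spearman_brown(r_single: float, length_factor: float = 2.0) -> float:
    """Spearman-Brown prophecy formula: reliability of a test lengthened by length_factor."""
    return length_factor * r_single / (1.0 + (length_factor - 1.0) * r_single)


# Hypothetical scores of six subjects on two administrations of one test.
first_trial = [41, 38, 45, 50, 36, 44]
second_trial = [43, 37, 47, 49, 39, 46]

# Raw score entered into the correlation matrix: sum of the two administrations.
raw_scores = [a + b for a, b in zip(first_trial, second_trial)]

# Reliability of the summed score: correlate the administrations, then step up.
r_between = pearson_r(first_trial, second_trial)
reliability = spearman_brown(r_between)
print(raw_scores)
print(round(r_between, 3), round(reliability, 3))
```

With `length_factor=2.0` the formula reduces to 2r / (1 + r), the usual correction when the score used is the sum of two parallel administrations.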


Results


The original centroid loadings are shown in Table 1, with the rotated loadings in Table 2. Of the five factors, one appears to be merely a residual and one a triplet; three would justify identification. In this discussion, test variables with significant loadings (.30) on each factor will be presented; test variables that have their highest loading on the factor will be identified by an asterisk (*).

Factor I is identified as Manual Dexterity and has significant loadings on the following tests:

Test
No.   Test                                        Loading
 2    Minnesota Rate of Manipulation, Placing     .73*
 4    Purdue Pegboard, Nonpreferred Hand          .68*
 5    Purdue Pegboard, Both Hands                 .63*
 1    Minnesota Rate of Manipulation, Turning     .61*
 3    Dowel Manipulation                          .60*
 6    O’Connor Finger Dexterity                   .50*
10    Bowling Green Tweezer Dexterity             .37
11    Placing, Tweezer                            .37
 9    Pin Moving                                  .32

The identification of this factor is based primarily on Tests 2, 4, 5, 1, and 3. The loadings on these tests are more than adequate, and this factor accounts for the major portion of the explained variance on each of these tests. The visual tests have loadings of near zero on this factor, indicating rather clearly that visual skill is not involved.

On each of the first five tests, a major characteristic is the movement of arm and hand. The other tests may also involve some fairly gross arm movement, although it is not so marked. Tests 10, 11, and 9, of course, have at least one other equal or higher loading on another factor.

Factor II, Visual Sensitivity, is identified by the four vision tests:

Test
No.   Test                                        Loading
15    Orthorater, Near Acuity, Right Eye          .65*
14    Orthorater, Near Acuity, Left Eye           .62*
12    Orthorater, Depth Perception                .57*
13    Depth Perception                            [...]

Other studies have reported more refined visual factors; Zachert (1951) reported the two acuity tests as having high loadings on a factor labeled “Acuity,” but included the Orthorater Depth Perception Test in her results as a specific factor having no significant loading on Acuity. Rabideau (1955) identified two more kinds of acuity factors. In this study, however, no such distinction could be made; accordingly we have chosen an identifying label (Visual Sensitivity) that is deliberately inclusive.


Table 1

Centroid Factor Loadings

Variable                          I      II     III    IV     V      h²

 1. Minnesota, Turning           .59    .27   -.13    .04    ...    .49
 2. Minnesota, Placing           .68    .27   -.23    .03    ...    ...
 3. Dowel Manipulation           .67    .31    .12   -.09    ...    .60
 4. Purdue Pegboard (N)          .54    .15   -.31   -.28    ...    .51
 5. Purdue Pegboard (B)          .50    .28   -.30   -.15    .08    .45
 6. O’Connor Finger              .62    .29    .08   -.09    .05    .48
 7. Placing, Finger              .39    .15   -.17    .06    .06    ...
 8. O’Connor, Tweezer            .46    .04    .32   -.23   -.03    .37
 9. Pin Moving                   .52    .07    .18    .16   -.17    .36
10. Bowling Green Tweezer        .56    .16    .15   -.20    .21    .46
11. Placing, Tweezer             .46    .07    .30    .18    ...    .37
12. Depth, Orthorater            .34   -.44   -.18    .10    .15    .37
13. Depth Perception             .40   -.42   -.03    .14    .20    .40
14. Near Acuity, Left Eye        .37   -.60    .12    ...    .20    ...
15. Near Acuity, Right Eye       .46   -.57    .07    .13    .12    .57

(Entries shown as ... are illegible in the source.)


Factor III had only one test (O’Connor Tweezer Dexterity) with a loading greater than .30. It is considered a residual.

Factor IV is a triplet which we have not identified:

Test
No.   Test                                        Loading
 9    Pin Moving                                  .41*
11    Placing, Tweezer                            .37
 3    Dowel Manipulation                          .33

It might be suggested that [...] same type of muscle control, [...] this factor would be too tentative to be justified.

Factor V is tentatively defined as Visual Feedback; there appears to be no precedent for it in factor analytic literature. The following tests have significant loadings on this factor:


Test
No.   Test                                        Loading
10    Bowling Green Tweezer Dexterity             .53*
 6    O’Connor Finger Dexterity                   [...]
 8    O’Connor Tweezer Dexterity                  .42*
14    Orthorater, Near Acuity, Left Eye           .40
15    Orthorater, Near Acuity, Right Eye          .36
13    Depth Perception                            .34
 3    Dowel Manipulation                          .33
 9    Pin Moving                                  .31

The interpretation of this factor defines Visual Feedback as the ability to use visual cues in the manipulation and placing of small objects. The dexterity tests with high loadings on this factor require the placing of small pins or discs in holes. It appears that more than skillful manipulation with tweezers or fingers is necessary in this operation; it is also necessary to interpret correctly the sensory cues received while performing the task and to adjust performance accordingly. The high loadings of three vision tests on this factor indicate that a visual ability is involved. The nature of this visual ability may be related to the positioning of pins or discs before the actual placing of them in the holes. Apparently some visual ability is needed for adequate performance because of the extreme smallness of the objects being manipulated.


Table 2

Rotated Factor Loadings

Variable                          I      II     III    IV     V      h²     Reliability*

 1. Minnesota, Turning           .61    ...    ...    ...    ...    ...    ...
 2. Minnesota, Placing           .73    ...    ...    ...    ...    ...    ...
 3. Dowel Manipulation           .60   -.04    .14    .33    .33    .60    ...
 4. Purdue Pegboard (N)          .68    .17    .13    .03    .01    .51    .80
 5. Purdue Pegboard (B)          .63    ...    ...    ...    ...    ...    ...
 6. O’Connor Finger              .50    ...    ...    ...    ...    ...    ...
 7. Placing, Finger              ...    ...    ...    ...    ...    ...    ...
 8. O’Connor, Tweezer            .22    .05    .34    ...    .42    .37    .85
 9. Pin Moving                   .32    .10    .00    .41    .31    .38    .68
10. Bowling Green Tweezer        .37    .04    .20    .00    .53    .46    ...
11. Placing, Tweezer             .37    .03    .28    .37    .15    .38    .82
12. Depth, Orthorater            ...    .57    .12    .04    .08    .38    .88
13. Depth Perception             .00    ...    ...    .05    .34    .41    ...
14. Near Acuity, Left Eye       -.19    .62   -.08    .14    .40    .61    .90
15. Near Acuity, Right Eye      -.08    .65    .00    .16    .36    .58    .86

* Reliability estimates are first-half, last-half correlations, corrected for full length by the Spearman-Brown formula.
(Entries shown as ... are illegible in the source.)


Other sensory cues are probably involved and act in conjunction with vision; perhaps in cases of visual impairment these could be substituted for visual cues. If other sensory cues are involved, then this factor could be defined by the more general term “Sensory Feedback.” The more restricted term seems preferable, however. For example, the Minnesota and Purdue test series had very low loadings on this factor. These are relatively gross tests; they do not seem to require such fine visual discrimination. It seems reasonable to suggest that any sensory feedback involved in these tests would involve skin senses more than vision.

It is also possible that this factor may be identified as eye-hand coordination, a designation conspicuous for its absence from the factor analytic literature. Such a designation is not justified, however, since the visual tests of this battery involve no motor activity. Fleishman (1953, 1954) identified a factor which is based on eye-hand coordination and which he called Aiming Ability. This factor he found only with paper and pencil tests, but a companion factor called Positioning was found with apparatus tests. This title might be appropriate for the present factor. The use of the same title might, however, imply a matching of the two factors, and such an implication would be most unjustified, since Fleishman’s Positioning factor had a high loading on the Minnesota Placing Test. Clearly, the naming of this factor is tentative, but further research seems indicated.
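The h² column of the loading tables is the communality, i.e. the sum of a test's squared loadings across the five factors. The sketch below checks this for the rows of the rotated table that are fully legible in the source; the dictionary layout and the rounding tolerance are assumptions made for illustration.

```python
def communality(loadings):
    """Sum of squared factor loadings for one test variable."""
    return sum(value * value for value in loadings)


# (rotated loadings on Factors I-V, reported h2) for rows legible in the source.
rotated_rows = {
    "3. Dowel Manipulation": ([0.60, -0.04, 0.14, 0.33, 0.33], 0.60),
    "9. Pin Moving": ([0.32, 0.10, 0.00, 0.41, 0.31], 0.38),
    "10. Bowling Green Tweezer": ([0.37, 0.04, 0.20, 0.00, 0.53], 0.46),
    "11. Placing, Tweezer": ([0.37, 0.03, 0.28, 0.37, 0.15], 0.38),
    "14. Near Acuity, Left Eye": ([-0.19, 0.62, -0.08, 0.14, 0.40], 0.61),
    "15. Near Acuity, Right Eye": ([-0.08, 0.65, 0.00, 0.16, 0.36], 0.58),
}

computed = {name: communality(loads) for name, (loads, _) in rotated_rows.items()}

for name, (loads, reported) in rotated_rows.items():
    # Loadings are printed to two decimals, so small rounding gaps are expected.
    print(f"{name}: computed {computed[name]:.3f}, reported {reported:.2f}")
```

Agreement to within rounding on these rows is also a useful check that the garbled table columns have been read in the right order.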
Discussion 


The hypotheses upon which the present research was based have not been supported, but important questions for additional study have been raised. Apparently, additional explained variance in psychomotor skills will not be found by the simple expedient of using finer movements or extra tools. Not only did the hypothesized tweezer dexterity factor fail to appear, but also the finger dexterity factor, accepted as previously established, was not found. Reference tests for such a factor were included here among the tests identifying the manual dexterity factor. An example of such a shift is the Purdue Pegboard. These tests appear to involve as a predominant feature an increasingly rapid and accurate movement of the entire arm; finer finger and wrist movements are involved, but to perhaps a lesser degree.
Apparently it can be concluded that, at least for standard tests of dexterity, even the finest tweezer or finger test will have a significant amount of its variance attributed to the ability to make rapid arm movements. If this is true, then factor analytic studies designed to test the hypothesis of a separate factor for these more restricted movements would need to provide tasks in which gross arm movements are either eliminated or greatly reduced.

Another possible explanation of the failure
to identify a finger dexterity factor may be 
methodological. Generally, previous research 
seems to have used a standardized order of 
testing. In such a case, it appears possible 
that many correlations obtained might be 
artifacts of practice or fatigue. Moreover, 
test variables requiring similar responses can 
provide a set for succeeding tests; in this 
manner the practice effect could be consider- 
able. In other words, it can be suggested as 
a hypothesis for research that order of test 
administration may influence the factorial 
structure of a test battery. If this hypothe- 
sis is tenable, then previously accepted fac- 
tors of psychomotor ability might warrant 
re-evaluation. 

Factor V, tentatively named “Visual Feedback,” accounts for much of the variance for the finer, more restricted tests that was not accounted for by the manual dexterity factor. Perhaps previous studies have found this factor but have not identified it as such because [...] in the battery. It seems possible that such feedback [...] and dexterity [...]. It seems plausible [...] still more of the total variance.

Consideration of this possibility raises the important question of whether much of the presently unexplained variance in psychomotor tasks could be accounted for by relationships that exist between factors in divergent domains. Future research that investigates possible relationships between different factor domains may find additional factors, such as the Visual Feedback factor that apparently does not belong in any one domain but appears to lie along the border of previously distinct domains.

G. L. Bourassa and R. M. Guion 


Summary and Conclusions 


This study has been concerned with the factor analysis of a battery of dexterity and vision tests in an effort (a) to identify a “tweezer dexterity” factor and (b) to determine the relationship between fine dexterities and visual skills, particularly depth perception. The centroid method was used and the five factors extracted were rotated orthogonally. Three of the factors were identified: (a) manual dexterity, the ability to make skillful and rapid arm movements, (b) visual sensitivity, the ability to make fine visual discriminations in acuity and/or depth perception, and (c) visual feedback, the ability to use fine visual cues in the manipulation and placing of small objects. The identified factors include neither the tweezer dexterity factor hypothesized, nor the separate finger dexterity factor found in previous studies with the same tests.


Received September 10, 1958. 


References 
Cattell, R. B. Factor analysis. New York: Harper, 1952.

Dvorak, Beatrice. The new USES general aptitude test battery. J. appl. Psychol., 1947, 30, [...]76.

Fleishman, E. A. Testing for psychomotor abilities by means of apparatus tests. Psychol. Bull., 1953, 50, 241-262.

Fleishman, E. A. A factorial study of psychomotor abilities. USAF Personnel Train. Res. Cent., Lackland AFB, San Antonio, Texas, May 1954.

Fleishman, E. A., & Hemple, W. E., Jr. Changes in factor structure of a complex psychomotor test as a function of practice. Psychometrika, 1954, 19, 239-251. (a)

Fleishman, E. A., & Hemple, W. E., Jr. Factorial analysis of complex psychomotor performance [...]. USAF Personnel Train. Res. Cent., Lackland AFB, San Antonio, Texas, April 1954, 54-12. (b)

French, J. W. The description of aptitude and achievement tests in terms of rotated factors. Psychometr. Monogr., 1951, No. 5.

Fruchter, B. Introduction to factor analysis. New York: Van Nostrand, 1954.

Hemple, W. E., & Fleishman, E. A. A factor analysis of physical proficiency and manipulative skill. J. appl. Psychol., 1955, 39, 17-19.

Rabideau, G. F. Differences in visual acuity measurements obtained with different types of targets. Psychol. Monogr., 1955, 69, No. 10 (Whole No. 395).

Zachert, Virginia. A factor analysis of vision tests. Amer. J. Optom. Arch. Amer. Acad. Optom., 1951, 38, 405-416.
Journal of Applied Psychology
Vol. 43, No. 3, 1959


INTERACTIONS BETWEEN DISPLAY GAIN AND 
TASK-INDUCED STRESS IN MANUAL 
TRACKING¹


W. D. GARVEY AND JEAN B. HENSON


Naval Research Laboratory 


It is important for the human engineer to know how the performance of man-machine systems is affected by factors which stress or overload the human operator. In an earlier study (Garvey, 1957) it was found that the order of merit of different tracking systems, produced by varying the nature of the control dynamics, remained unchanged when the operator was subjected to various forms of “task-induced stress.”² The present study extends the investigation to systems, all of which have the same control dynamics, but which differ in display magnification.


Apparatus and Procedure 


A compensatory position control tracking device was employed throughout the experiment. A flow chart of the tracking device is shown in Fig. 1. The S’s task was to attempt to hold a dot on the centerline of a 5-in. CRT by manipulating a spring-restrained control stick with the right hand. The target dot was continuously forced off center along the horizontal by a complex sine wave generator which furnished frequencies of 11, 7, and 3 cycles per min., the amplitudes being inversely proportional to the frequency. Any one of five display gain settings could be selected through the positioning of a switch. The gains and associated maximum target excursions are shown in Table 1. The system was so adjusted that with an arbitrary gain setting of 0.15, 10 ounces of force applied to the joy stick moved the dot 1 in. Increased magnifications were achieved by increasing the gain in steps up to 1.00, the highest magnification employed.

Two types of performance measures were taken, system error and display error. As may be seen in Fig. 1, system error is a measure of the extent to which the output of a tracking system follows the input. It is obtained by electronically integrating the continuous difference in voltage between the system input and system output without regard to sign. Display error, the system error multiplied by the display gain constant, is the input signal to the display and is, in effect, a measure of the error seen by the tracker. Although error at the display could always be computed from the system score, a separate electronic integrator was also used in the study to furnish a direct measure of display error.

¹ The authors are greatly indebted to F. V. Taylor for his assistance in the interpretation of the results and in the preparation of the [...].
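
The two error measures described above can be sketched numerically. This is only an illustration under stated assumptions: the function names, the unit amplitude scale, the 0.2-sec lag of the hypothetical operator, and the 0.01-sec integration step are all invented for the example; only the three course frequencies, the inverse amplitude rule, the 55-sec scoring window, and the relation "display error = gain × system error" come from the text.

```python
import math


def target_input(t_sec: float) -> float:
    """Course input: sines at 11, 7, and 3 cycles/min, with amplitude
    inversely proportional to frequency (arbitrary units)."""
    return sum((1.0 / f) * math.sin(2 * math.pi * (f / 60.0) * t_sec)
               for f in (11, 7, 3))


def integrated_errors(output, gain, start=5.0, stop=60.0, dt=0.01):
    """Integrate |input - output| over the scored part of a 1-min trial.
    Returns (system_error, display_error)."""
    steps = int(round((stop - start) / dt))
    system_error = 0.0
    for i in range(steps):
        t = start + i * dt
        system_error += abs(target_input(t) - output(t)) * dt
    # Display error is the system error scaled by the display gain.
    return system_error, gain * system_error


# Hypothetical operator whose output follows the input with a 0.2-sec lag.
def lagged(t):
    return target_input(t - 0.2)


sys_err, disp_err = integrated_errors(lagged, gain=0.5)
print(sys_err, disp_err)
```

The rectangle-rule sum stands in for the electronic integrator; any finer numerical scheme would serve equally well for the illustration.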

Each tracking trial was 1 min. in duration and performance during only the last 55 sec. of the trial was scored. The trackers were five Naval enlisted men. The general experimental design consisted of five Ss, five display gain settings, and five 1-min. trials. The design was permuted from session to session so that each S had four trials per day with each magnification. At the end of the seventh session (i.e., after 28 practice trials with each gain) Ss were required to operate under a series of conditions intended to degrade performance. In accordance with previous practice (Lazarus, Deese, & Osler, 1952) these will be referred to as types of task-induced stress.

Each stress condition was presented on a separate day with a different counterbalancing of display magnifications and Ss. The Ss were always given one trial without stress on the particular magnification level they were to experience next under stress. A careful explanation was then made as to how each stress mode of operation differed from that of the training sessions. The stress trials consisted of one 1-min. trial with each gain setting under each of the following conditions:

Secondary arithmetic task. The Ss were required to solve two-digit mental subtraction problems at a rapid pace, while tracking.

Incompatible display-control relation. The Ss performed the tracking task with a display-control relationship which was the reverse of that on which they were trained.

Table 1

Magnification and Maximum Target Excursion for Each Display Gain

Display Magnification    Maximum Target Excursion
× 1.0                    2.0 in.
× 1.7                    3.4 in.
× 2.1                    4.2 in.
× 3.3                    6.7 in.
× 6.7                    13.4 in.

² For a more complete description of these stress conditions, see Garvey (1957).

Two-hand tracking. The Ss were required to track 
two dots simultaneously, one with a right-hand con- 
trol and one with a left-hand control. The left-hand 
system was identical with the right-hand system; the 
two dots followed the same course input, which was 
that employed during the training sessions. Only the 
performance of the right-hand system was used in 
analyzing the results. 

Two-coordinate tracking. In this condition the dot moved and was tracked in both the vertical and horizontal coordinates. The dot was controlled with the right stick only. The course input to the system was identical with that used during training except its path of movement was rotated 45 deg. from the horizontal axis. Only the performance of the system in the horizontal coordinate was used in analyzing the results.

Secondary visual task. In addition to the tracking 
task employed during training, Ss were required to 
perform simultaneously a second task which con- 
sisted in detecting and reporting range and bearing 
of targets on a simulated radar scope. 


Fig. 1. Simplified block diagram of manual tracking system.


Results


After [...] the performance with all magnifications appeared to have leveled off. The pooled data from the last five of the 28 training trials are plotted in Fig. 2. The solid lines show the effect of magnification on system error in the left graph and on display error in the right graph. Increasing magnification reduces system error and increases display error (p < .001).

Effect of stress. The performance under task-induced stress is also shown in Fig. 2. In the dotted-line graphs the data from all five stress conditions are pooled. It may be seen that the effect of stress is to increase error scores, but to leave the rank order of conditions unchanged. There is a tendency for the absolute differences between stressed and unstressed performance to increase going from low to high magnification. Although this trend is not significant (p > .05) for system error scores, it is significant with display error (p < .01).

³ The statistical methods employed in the analysis of these data were nonparametric tests developed by Wilcoxon (1949). Case I (Wilcoxon, 1949, p. 4) was used for determining the significance of the difference between two conditions, and Case III (Wilcoxon, 1949, p. 6) was used for the comparison of several conditions.

Figure 3 shows the median percentage of deterioration in performance under stress for each of the stress conditions. The percentage of deterioration may be calculated either from system-error or display-error scores; the calculated quantity is mathematically the same whichever source is used for the calculation. In general, it may be seen that the higher display-magnification systems show greater deterioration. These functions are all significant (p < .05) with the exception of the arithmetic-task condition (.10 > p > .08). However, even in this case the deterioration at the highest magnification was significantly greater (p < .01) than that with the lowest magnification system.
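
The statement that percentage deterioration is the same whether computed from system-error or display-error scores follows because display error is system error times a constant gain, and the constant cancels in the ratio. A minimal sketch, with made-up error scores and a hypothetical function name:

```python
import math


def percent_deterioration(unstressed: float, stressed: float) -> float:
    """Percentage increase in error under stress relative to unstressed error."""
    return 100.0 * (stressed - unstressed) / unstressed


gain = 0.5                                   # any fixed display gain
sys_unstressed, sys_stressed = 60.0, 90.0    # hypothetical system-error scores

# Display error is system error multiplied by the display gain constant.
disp_unstressed = gain * sys_unstressed
disp_stressed = gain * sys_stressed

from_system = percent_deterioration(sys_unstressed, sys_stressed)
from_display = percent_deterioration(disp_unstressed, disp_stressed)
print(from_system, from_display)
```

The equality holds only within one gain setting; comparisons across settings still differ, which is what Fig. 3 plots.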


Discussion 


The effect of magnification. The fact that increasing display gain improves system performance has led investigators (Battig, Nagel, & Brogden, 1955; Bowen & Chernikoff, 1957; Helson, 1949) to infer that operator performance is improved. However, performance, judged in terms of the error displayed to the operator, is shown to deteriorate as display gain was increased. Thus, in terms of display error, an opposite conclusion may be reached about “operator performance.” These differing conclusions demonstrate that scores obtained from a man-machine system are not always a valid indicant of the response of the operator in the system.

Fig. 2. Median integrated error as a function of display gain. The solid curves represent performance without stress; the dotted curve represents that with stress. System error is shown in the left graph and display error in the right graph.

The effect of stress. If the tracking device with each of the different gain settings employed in the present study is regarded as constituting a different man-machine system, the results are in agreement with the previous experiment (Garvey, 1957) in demonstrating that the order of system merit is not changed by the particular forms of stress. With magnification as the system variable, the systems having the better scores in the absence of stress remain the better systems during stress. This is of considerable practical importance to the engineering psychologist, although generalization to all varieties of systems is certainly not yet warranted.

It is clear from the results that the effects of stress may be differently interpreted from the two measures of performance, i.e., the best tracking performance (in terms of system error) and the poorest tracking performance (in terms of display error), show the greatest deterioration under stress. These results serve further to point out the pitfalls of directly equating human performance with tracking scores in experiments conducted with man-machine systems.

But this is not to say that no inferences about the human can be made. The results of the present experiment lead rather directly to the hypothesis that stress, of the type employed in this study, will cause the greatest degradation in the performance of those position control systems which most tax the human operator. It is reasonable to suppose that the greater the error S perceives, the more effort he expends in attempting to reduce it and the less of his capacity remains available for simultaneously contending with secondary tasks or other circumstances which also demand his attention. In this situation, effort devoted to dealing with intrusions of any sort must be subtracted from that previously fully accessible to the tracking task, with the result that the latter suffers to the extent that the needed capacity is no longer available.

Fig. 3. Median percentage of deterioration in performance under conditions of stress as a function of display gain. The five curves represent the different stress conditions employed (arithmetic, incompatible display-control, two-hand, two-coordinate, visual task).

The finding of the present study, that stress causes a greater deterioration in performance the higher the gain of the display, should not lead to the recommendation that display gain should always be set as low as possible in any practical control system. It must be remembered that the control designer works to minimize system error, regardless of what happens to the error at the display. Although stress does exact a heavier toll with higher magnifications, it must be recalled that the higher gain systems, in terms of system error, always remained superior to the lower in every form of stress employed in this study.

Summary 


An experiment was conducted to determine how the performance of man-machine systems is affected by factors which stress or overload the human operator. Five systems, all of which had the same dynamics but differed in display magnification, were employed. Operators were given considerable training on all five systems, after which they were required to control the systems under a series of stressful conditions. Performance was measured both in terms of system error and error at the display.

The results indicated that stress increased error in all systems, but the order of merit of the various systems was unchanged by stress. The results are discussed in terms of their relevance to the study of man-machine systems. The pitfalls of purely psychological interpretations of the behavior of tracking systems are outlined.


Received September 19, 1958. 


References 


Battig, W. D., Nagel, E. H., & Brogden, W. J. The effects of error-magnification and marker size on bidimensional compensatory tracking. Amer. J. Psychol., 1955, 68, 585-594.

Bowen, J. H., & Chernikoff, R. The relationships between magnification and course frequency in compensatory tracking. U. S. Naval Res. Lab. Rep., 1957, No. 4913.

Garvey, W. D. The effects of “task-induced stress” on man-machine system performance. U. S. Naval Res. Lab. Rep., 1957, No. 5015.

Helson, H. Design of equipment and optimal human operation. Amer. J. Psychol., 1949, 62, 473-497.

Lazarus, R. S., Deese, J., & Osler, S. F. The effects of psychological stress on performance. Psychol. Bull., 1952, 49, 293-317.

Wilcoxon, F. Some rapid approximate statistical procedures. Stamford, Conn.: American Cyanamid Co., 1949.
EFFECTS OF FEEDBACK ON INSIGHT AND PROBLEM
SOLVING EFFICIENCY IN TRAINING GROUPS


EWART E. SMITH AND STANFORD S. KIGHT


Fels Group Dynamics Center, University of Delaware


Human relations courses in management development programs are currently very popular, despite a dearth of experimental evidence supporting the postulated learning principles used in constructing these courses. Two such principles have been tested in the field experiment reported here.

The hypotheses were:

1. Feedback will: (a) increase group productivity and (b) increase self-insight.

2. Subgroup structure will result in: (a) increased group productivity and (b) increased self-insight.


Method 


Subjects and situation. The subjects (Ss) were 54 male and 49 female first level supervisors (foremen) in a large eastern corporation. They were participating in a five-day management training course which stressed interpersonal factors in management. The course emphasizes experiential learning, with role playing exercises, demonstrations of perceptual [...], group discussions, etc., and only a few short, [...] lectures. Nine training groups varying in size from 11 to 13 were used. Each group contained both men and women. The Ss did not know they were participating in an experiment.

Experimental design. In Experimental Condition A, feedback groups of three or four members each were established. These feedback subgroups were scheduled to meet for 30 minutes at the end of each half day. However, 60 minutes per day proved to be more time than was readily available. Less time was given to the [...] as the commitment to research [...] staff faded with time. Consequently, data were obtained on the amount of time actually spent in subgroups.

The Ss were given an outline to guide them in the feedback sessions, which asked the individual to describe to his subgroup how he saw his own behavior, why he behaved as he did, etc. The subgroup members were then asked to compare their perceptions of his behavior with his account. Then the subgroup was asked to give the individual feedback on how his behavior affected the leadership problems of the group, whether he talked too much or too little, whether he was trying out new behavior or playing it safe, etc.

Condition B was similar to Condition A in that 
the Ss were placed in subgroups of three or four 
members each. These subgroups were scheduled to 
meet for 30 minutes at the end of each half day, to 
discuss the course content covered the preceding half 
day. Such topics as what they had liked best or 
least, what they had learned, etc. were covered. 
Condition B was designed as a control condition to 
determine if any changes produced by the feedback 
in Condition A could be attributed to subgrouping 
rather than feedback, and also as a test of the pos- 
tulate that subgrouping per se will increase learning 
in training groups. 

Condition C was a straight control condition with 
no induced feedback or subgrouping. 

Three training groups were assigned to each condition. The courses were conducted in the usual manner by the company’s training staff. Each group had two leaders who had had some training and considerable experience in conducting these courses. These leaders were males holding third-level supervisory positions in the company. E was introduced to the group as a consultant at the end of the course on Friday afternoon, when the testing was done.

Dependent variable measurements. Group productivity was measured by the number of problems solved by each three person subgroup. In Conditions A and B, the subgroups tested on the problem solving task were those which had been meeting during the week for feedback, or to discuss the content of the course. In Condition C the subgroups were produced for the problem solving task by placing the first three people sitting together in one subgroup, placing the next three sitting together in the next subgroup, etc.

The problem solving task (Taylor & Faust, 1952) is an adaptation of the parlor game “Twenty Questions” in which Ss had to identify items designated by the experimenter as either animal, vegetable, or mineral. Each problem had to be solved with less than 40 questions for the subgroup to receive credit. No time limits were applied to the solving of particular problems, but the productivity of each subgroup was measured by the number of problems solved in 10 minutes. To each question asked by a group member, E replied in one of the following ways: (a) yes, (b) no, (c) partly, (d) sometimes, (e) not in the usual sense of the word, (f) I don’t know (no charge for the question), (g) please restate the question (if the question was unclear or could not be answered in one of the above ways). Difficult items such as wrench, ruby, and bread were used. This task has, in other investigations (Goldman, Bolen, & Martin, 1958; Smith, 1957; Taylor & Faust, 1952), successfully discriminated between groups whose effectiveness could be predicted on other criteria.

Insight was measured by having each conferee indicate, for each of 10 leadership roles, who (including himself) took the role well. An insight score was computed by determining the discrepancy (ignoring the sign) between the number of roles S credited to himself, and the average number of roles credited to him by the two staff leaders. Self-insight of foremen has been shown to be highly related to the productivity of their departments (Nagle, 1954).
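
The insight score defined above is an absolute discrepancy, so lower values mean closer agreement between S and the staff leaders. A small sketch of that arithmetic, where the function name and the example counts are illustrative assumptions rather than data from the study:

```python
from typing import Sequence


def insight_score(self_credited: int, leader_credited: Sequence[float]) -> float:
    """Absolute difference between the roles S credits to himself and the
    mean number of roles the staff leaders credit to him (lower = more insight)."""
    leader_mean = sum(leader_credited) / len(leader_credited)
    return abs(self_credited - leader_mean)


# Hypothetical conferee: claims 6 of the 10 roles; the two leaders credit 3 and 4.
print(insight_score(6, [3, 4]))
```

Because the sign is ignored, over-claiming and under-claiming by the same amount yield the same score.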
An anonymous evaluation of the training course was also obtained from each conferee by asking him to rate the course on a five point scale in the following three areas: how much he felt he benefited from the training, how much he felt he learned about himself as a result of the training, and how helpful he felt the training would be back on the job.


Results and Discussion¹


As an analysis of the data on all measures indicated that Control Conditions B and C were not different, they have been combined. This failure of Conditions B and C to differ indicates that subgroup structure did not have the hypothesized effect on learning and that any effects occurring in Condition A result from feedback alone.

Plotting the group task scores revealed that normal distributions could not be assumed. Consequently, a nonparametric rank test was used (Edwards, 1954). By this method a z is obtained which is a normal deviate, evaluated by the use of the normal probability table. The median number of problems solved by the 10 three-person subgroups in the Feedback condition was 2.00, compared to a median of zero for the 21 control subgroups. The z on these data is 2.27*. These differences are highly consistent, as revealed by reclassifying the subgroups according to whether they solved one or more problems or were unable to solve any. With such an analysis, we find that all of the feedback subgroups solved at least one problem, compared to nine of the 21 control subgroups (the p on these data is .00[...] by Fisher’s exact test).

Table 1

Mean Insight Scores Under Experimental and Control Conditions

                                                  Feedback
Statistic on Insight Scores                 I       II      III     Control
Number of minutes in feedback subgroups     230     193     165     0
Mean                                        .90     1.73    2.38    2.41
n                                           10      11      12      69

Note. Low scores indicate insight. CR on Feedback I and Control, [...]; CR on Feedback I and Feedback III, 2.12.

¹ Statistical significance at the .05 level has been indicated by a single asterisk; at the .01 level or better by a double asterisk.
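
The Fisher's exact test mentioned above can be reproduced from the counts given in the text: 10 of 10 feedback subgroups solved at least one problem, against 9 of 21 control subgroups. The sketch below assumes a one-sided test (the article does not say which form was used) and uses a hypothetical helper name; it sums hypergeometric probabilities for tables at least as extreme as the one observed.

```python
from math import comb


def fisher_one_sided(a: int, b: int, c: int, d: int) -> float:
    """One-sided Fisher exact p for the 2x2 table [[a, b], [c, d]]:
    probability of a top-left count >= a with all margins fixed."""
    row1, col1, n = a + b, a + c, a + b + c + d
    upper = min(row1, col1)
    numerator = sum(comb(col1, k) * comb(n - col1, row1 - k)
                    for k in range(a, upper + 1))
    return numerator / comb(n, row1)


# Rows: feedback, control. Columns: solved >= 1 problem, solved none.
p = fisher_one_sided(10, 0, 9, 12)
print(p)
```

Under these assumptions the probability is about .002.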

Feedback, then, produces greater problem solving efficiency. It should be noted, however, that these data do not indicate whether feedback made the conferees more effective members of problem solving groups in general, or only made these particular subgroups more efficient.

If all feedback and control groups are compared on insight scores, they are not significantly different. However, as we see in Table 1, the strength of the feedback independent variable was gradually reduced by a decline in the amount of time the training staff devoted to the feedback subgroups. It is interesting to note the negative correlation between the amount of time spent in feedback subgroups and the mean insight error scores. As indicated in Table 1, the first feedback group, which came closest to receiving the specified amount of time in feedback subgroups (300 minutes), had significantly lower insight scores than either the last feedback group or the control groups.

The mean rating of the course by the conferees in the feedback condition was 12.97, which is significantly lower (CR 2.27*) than the mean of 13.66 for the control condition. In view of the superiority of the feedback condition as indicated by the problem solving and insight measures, it would appear that the frequent use of the opinions of conferees for evaluating industrial training programs should be re-evaluated.



Summary 


An experiment was conducted in a field setting to investigate two of the learning principles utilized in human relations courses. The Ss were 103 first line supervisors, in groups of about 12, in a one week, highly participative management course. In the experimental condition, the training groups were divided into three-person subgroups which met twice daily to give each other personalized feedback on their behavior in the group. In one control condition, there were twice daily meetings of three person subgroups which discussed the content of the course, and in the other control condition there was no subgrouping.

The hypotheses tested in the present study were:

1. Feedback will: (a) increase group productivity and (b) increase self-insight.

2. Subgroup structure will result in: (a) increased group productivity and (b) increased self-insight.

The data indicated that personalized feedback markedly, and consistently, improved group problem solving efficiency. Under some conditions, feedback improved self-insight. Anonymous evaluations of the course by the trainees favored the control conditions, indicating that conferee ratings may not be an adequate basis for evaluating such courses. The hypotheses regarding subgroup structure were not supported.


Received October 13, 1958. 
Early Publication. 


References 


Edwards, A. L. Statistical methods for the behav- 
ioral sciences. New York: Rinehart, 1954. 
Goldman, M., Bolen, M. E., & Martin, R. B. The effect of group structure on the [...] groups engaged in a problem solving task. Amer. Psychologist, 1958, 13, 353.

Nagle, B. F. Productivity, employee attitude and 
supervisor sensitivity. Personnel Psychol., 1954, 7, 
219-233. 

Smith, E. E. The effects of clear and unclear role 
expectations on group productivity and defensive- 
ness. J. abnorm. soc. Psychol., 1957, 55, 213-217. 

Taylor, D. W. & Faust, W. L. Twenty questions: 
Efficiency in problem solving as a function of size 
of group. J. exp. Psychol., 1952, 44, 360-368. 


Journal of Applied Psychology
Vol. 43, No. 3, 1959


PERSONALITY TEST SCORES IN THE MANAGEMENT
HIERARCHY: REVISITED


HENRY D. MEYER AND ALAN J. FREDIAN


Stevenson, Jordan & Harrison, Inc., Chicago 


One major purpose of this study was to re- 
peat, with an entirely new sample, the man- 
agement hierarchy validation study of the 
Employee Questionnaire, the personality test 
developed and used by Stevenson, Jordan & 
Harrison psychologists in their management 
personnel appraisals. The original study 
(Meyer & Pressel, 1954) was done on 459 
cases, covering the period of July, 1949 to 
February, 1952. The present study was done 
on 678 cases, covering the period from July, 
1955 to May, 1957. The soundness of the 
use of the Employee Questionnaire in the ap- 
praisal of management candidates depends, in 
part, on the consistency of management hier- 
archy trends in its various personality scales, 
This consistency was to be tested. 

A second major purpose of this study was 
to perform, for the first time, a management 
hierarchy validation study of nine new scales 
added to the Employee Questionnaire follow- 
ing the original study. The seven scales of 
the EQ-B of the original study were objec- 
tivity, social dominance, social extroversion, 
drive, detail, emotionality, and adjustment 
(poor). The additional nine scales which, together with these original seven scales, constitute the EQ-C of the present study, were social consideration, judgment and decision, adjustment somatic, psychopathic tendencies, drive persistence, recognition anxiety, personal achievement motivation, compensatory achievement motivation, and independent achievement motivation. In addition, the original drive scale was modified to a more limited scale: drive irritability.

In the original study (Meyer & Pressel,
1954), social dominance, detail, emotionality,
and adjustment (poor) were found to have
observable trends and statistically significant
Fs with hierarchy. Also, efforts were made
to control for age, education, occupation, and
bias. The general aim of the present study
was to parallel, as far as possible, the original
study on a new sample tested with the modified
and expanded Form C of the EQ personality
test. However, the results for occupation
will not be presented in the present study.


Procedures
The Hierarchy Categories

The original concept of the industrial management
hierarchy (Meyer & Pressel, 1954) was modified to
include an additional level of jobs, labeled technicians,
as Level V just above general factory and
clerical employees now at Level VI. The technician
level encompassed such job titles as time study man,
process man, production scheduler, draftsman,
designer, lead man, and group leader. The
hierarchy (Table A)¹ was the same as originally,
but generalized as follows: Level I, corporate officers
and division general managers; Level II, works managers
and heads of major functional areas of non-officer
status; Level III, supervisors of departments
of major functional areas and exceptional sales and
engineering staff jobs of equivalent status; Level IV,
first line production and office supervisors and
specialists; Level V, technicians; Level VI, general
factory and clerical employees.

The Sample

Inasmuch as in three out of six hierarchy levels
the new data sample would not yield 100
cases as in the original study (Meyer & Pressel, 1954),
it was decided to take from the S. J. & H. appraisal
files all available cases at each level that met the
original criteria (Meyer & Pressel, 1954) for selection
and placement in a particular hierarchy level.
The end result was a relatively small number of
cases for Levels I and II, 30 and 44 cases;
a good N for Levels III, V and VI, 144, 78,
and 129 cases; and a very large N for Level IV, 253
cases.

¹ A detailed summary of hierarchy definitions,
intercorrelations, hierarchy, education and age trends,
comparative magnitudes of trends, and statistical
analysis of hierarchy trends for all EQ-C scales is
found in Tables A, B, C, D, E, F, G, and H, which have
been deposited with the American Documentation
Institute. Order Document No. 5884 from ADI Auxiliary
Publications Project, Photoduplication Service,
Library of Congress, Washington 25, D. C.,
remitting $1.25 for 35 mm. microfilm or 6 × 8 in.
photocopies. Make checks payable to Chief,
Photoduplication Service.


The Personality Test

Data for the present study were obtained from
the EQ-C, a revision of the EQ-B. The EQ-C
was developed by an item analysis of 142
items, consisting of most of the original 75 items of
the EQ-B and additional items designed to form six
of the new scales listed previously. In this
analysis, a major effort was made to eliminate items
that had high correlations with objectivity, the bias
key. Also, an attempt was made to reduce, as far
as possible, the use of items on more than one scale.
The end result, after a second round of development
and item analysis for the three achievement motivation
scales, was a 105-item test with scales of 7 to
12 items. While similar scales for the EQ-B and
EQ-C do not have exactly the same items, the
correlations between similar scales ranged from r = +.34
between drive and drive irritability, and r = +.62
between adjustment (poor) and adjustment psychological,
to r = +.80 for social dominance and
r = [value illegible] for detail (Table B).¹

The development of six of the nine additional
scales of the EQ-C was for the purpose of broadening
the interpretations derived from the test
and to make up for some previous deficiencies in
coverage of characteristics believed to be critical to
management. Social consideration was designed
to fill a gap in the area of dealing with people
sympathetically; judgment and decision, to add
information about decisiveness and impulsiveness;
drive persistence, to reveal staying power rather
than intensity of effort; adjustment somatic, to reveal
susceptibility to psychosomatic reactions to job
and personal pressures; psychopathic tendencies, to
reveal antisocial inclinations; and recognition anxiety,
to reveal anxiety about pleasing superiors. The
selection of scale items was by face validity followed
by item analysis of all the items of the test against
high and low scoring groups on each scale.

The three achievement motivation scales were
introduced after the final selection of items for the
other scales of the EQ-C. They were based on
McClelland's (1953) research on achievement motivation
and Adler's (1929) compensation theory of motivation.
The personal achievement scale was formed
by selecting, from all the established scales, items
which conformed to McClelland's signs of achievement
motivation. The independent achievement scale
was developed from new items relating to the fostering
of independence and resourcefulness by the candidate's
parents and environmental circumstances
(McClelland, 1953). The compensatory achievement scale
was developed from new items relating to parental
rejection (Adler, 1929). These three scales were
subsequently subjected to an independent item analysis
and modified accordingly.

Intercorrelations of the 16 scales of the EQ-C
(Table [letter illegible]), based on 100 cases, ranged from r = .00
to r = -.63, the latter between emotionality and
judgment and decision. Out of 120 intercorrelations,
14 had an r of + or -.50 or higher. Of these 14,
most had to do with the relations between social
consideration, judgment and decision, emotionality,
adjustment psychological, adjustment somatic, drive
irritability, and recognition anxiety. These
intercorrelations generally indicated that low judgment and
decision and low social consideration scores went
with high irritability, emotionality, anxiety, and poor
adjustment scores, and that scores on one kind of
poor adjustment scale went with scores on another.
The latter was to be expected in the light of the
attempt, in constructing the scales, to identify more
refined subcomponents of a general adjustment factor.

While the test-retest reliability of the EQ-C scales 
has not yet been measured, previous test-retest reli- 
ability of the EQ (Rothe, 1950) at r=.69 to .84 
has been satisfactory for its use as an appraisal aid. 
The EQ-C and earlier forms have sacrificed the split- 
half reliability that might have been obtained with 
fewer and longer scales of 40 items or more. Such 
long scales are not feasible in a short industrial
appraisal test wherein the aim is to get as many clues
about individual characteristics as possible in a
short time. These clues can then be probed in an
intensive follow-up interview.


Statistical Procedures

Statistical procedures² were similar to the original
study (Meyer & Pressel, 1954) in that mean scale
scores for different hierarchy groups, educational
groups, and age groups were computed. Again, a
single classification F test analysis of variance for
hierarchy alone was used to test the statistical
significance of the observed mean differences on
hierarchy for each test scale (Table D). The three
scales lacking homogeneity of variance were
transformed by the arc-sine transformation (Snedecor,
1946) for the analysis of variance (Table D).¹ For
two scales, where the data were too skewed to meet
assumptions of normality, the data were ranked
before computing the F test.

Because of the irregular N in the hierarchy groups,
a different approach was used to test the independence
of hierarchy trends than the double classification
F test used previously (Meyer & Pressel, 1954).
A count was made of the number of changes,
in the predominant direction of change,
of the mean EQ-C scale
score for each step up the hierarchy (variable group)
within each separate educational or age (nonvariable)
group (Table A).³ The probability of this


² The authors are indebted to Jere P. Wilson for
carrying out the statistical procedures and contributing
to their formulation.

³ A detailed report of the statistical procedures
used, along with detailed findings on the independence
of trends in hierarchy, education, and age in
Table A, has been deposited with the American
Documentation Institute. Order Document No.
5883 from ADI Auxiliary Publica-




Table 1
EQ-C Test Scale Means and Sigmas According to Hierarchy Level

                                                    Hierarchy Level
                                I           II          III         IV          V           VI
                              N=30        N=44        N=144       N=253       N=78        N=129
EQ-C Scale                  M     SD    M     SD    M     SD    M     SD    M     SD    M     SD
Independent Achievement***  ?     1.7   6.5   2.3   6.3   1.9   6.1   1.8   5.7   ?     ?     1.5?
Detail***                   5.4   1.2   4.6   1.2   4.9   1.6   5.5   1.6   5.8   1.2   5.5?  1.9?
Social Dominance***         ?     2.1   6.5   1.9   6.8   1.9   6.4?  1.9   5.6   2.1   ?     ?
Judgment and Decision**     6.0   1.6   6.9   1.5   6.7   1.6   6.4   1.7   6.3   1.6   4.5?  1.7?
Recognition Anxiety***      4.3   1.3   3.6   1.3   3.6   1.3   4.1   1.6   4.3   1.5   4.5?  1.5?
Drive Irritability**        4.7   1.4   3.7   1.8   4.1   1.6   4.3   1.5   4.5   1.4   ?     1.7?
Adjustment Psychological*   2.8   1.7   2.6   1.6   2.9   1.5   3.0   1.8   ?     1.6   ?     2.0?
Social Extroversion**       7.5   2.0   7.4   2.1   ?     2.2   8.0   1.9   7.1   2.3   5.7?  2.1?
Objectivity                 ?     ?     ?     2.7   6.3   1.9   6.0   2.4   5.9   2.3   ?     ?

Note. A question mark denotes an entry that is illegible or uncertain in the source copy.


*, **, and *** indicate, respectively, 


occurring by chance was computed. While this test of
independence is weaker
than the double classification F test, relative to
proper consideration of the size of the N and the
[...], it is a strong [...] changes in the [...].

Results 
Hierarchy Trends 


nance, Psychological. 
Emotionality in the EQ-C did not achieve a
significant F at the .05 level although it had
a small trend. Similarly, objectivity and social
extroversion did not have observable hierarchy
trends in either the EQ-B or EQ-C.
Social extroversion had F test significance in
the EQ-C, but not in the EQ-B. Drive irritability
did have an observable trend and significant
F in the EQ-C, but drive did not in
the EQ-B.


tions Project, Photoduplication Service, Library of
Congress, Washington 25, D. C., remitting $1.25 for
35 mm. microfilm or 6 × 8 in. photocopies. Make
checks payable to Chief, Photoduplication Service.


significance at the .05, .01, and .001 levels for a single classification F test.


Of the seven original scales of the EQ-B,
detail (downward) and social dominance (upward)
had the second and third most marked⁴
scale trends with hierarchy (Table F)¹ of all
the EQ-C scales, and were two of the hierarchy
trend scales in the EQ-C with F test
significance at the .001 level. Ascending social
dominance and descending detail scores
seem consistent findings in the
management hierarchy for repeated samples.

Of the new scales of the EQ-C, independent
achievement was not only the most marked
in terms of magnitude of hierarchy trend (upward)
of all the EQ-C scales, but it was also
significant at the .001 level and there were no
reversals of the trend at any of the hierarchy
levels. Recognition anxiety had the
fifth most marked hierarchy trend (downward)
and was significant at the .001 level.
Judgment and decision (upward) and drive
irritability (downward) were the fourth and
sixth most marked trend scales in the EQ-C,
and both were significant at the .01 level.

Of the remaining new scales, social consideration
(upward), adjustment somatic (downward),
psychopathic tendencies (downward),
and personal achievement motivation (downward)


⁴ Referring to the magnitude of the difference between
the average scale scores at the hierarchy levels
constituting the bottom and the top of the trend
covering several hierarchy levels.




Table 2
EQ-C Test Scale Means and Sigmas According to Education Level

                                                    Education Level
                                        4 Years     2 Years     4 Years     2 Years     Grade
                            Graduate    College     College     High School High School School
                              N=60        N=189       N=117       N=246       N=29        N=34
EQ-C Scale                  M     SD    M     SD    M     SD    M     SD    M     SD    M     SD
Independent Achievement     6.9   1.9   6.2   1.8   6.4   1.6   5.7?  1.7   6.0   2.0   5.4   2.4
Detail                      ?     ?     ?     ?     ?     ?     ?     ?     ?     ?     ?     ?
Social Dominance            6.9   1.9   6.6   1.4   6.4   1.7   5.6   2.0   6.2   1.9   5.0?  1.7
Judgment and Decision       6.8   1.4   6.8   1.4   6.5   1.8   6.1   1.6   5.6   1.6   5.0   1.8
Recognition Anxiety         3.2   1.2   3.7   1.5   3.9   1.5   4.4   1.4   4.6   1.5   5.2   1.9
Drive Irritability          3.7   1.8   4.1   1.4   4.3   1.6   4.5   1.5   4.5   1.7   4.7   1.6
Adjustment Psychological    2.4   1.4   2.7   1.5   2.9   1.6   3.3   1.7   3.6   1.4   4.3   1.9
Social Extroversion         8.2   1.9   7.9   2.2   7.9   2.0   7.8   2.1   7.3   1.9   6.5   2.4
Objectivity                 6.4   2.4   6.2   2.3   6.0   2.1   5.8   2.2   5.8   2.2   5.6   2.0

Note. A question mark denotes an entry that is illegible or uncertain in the source copy.


had slight hierarchy trends and
were significant at the .05 level.
Of the 16 EQ-C scales, only four have no
discernible trends with hierarchy. They are
objectivity, social extroversion, drive persistence,
and compensatory achievement motivation.
Only four scales did not achieve F test
significance of their hierarchy means at the
.05 level or better. They were objectivity,
emotionality, drive persistence, and compensatory
achievement.
The greatest difference in hierarchy trend
results between the original study (Meyer &
Pressel, 1954) and the present one is the
marked failure of the top hierarchy group to
follow the otherwise observable hierarchy
trends (Table 1) (Table E).¹ In all the
12 hierarchy trend scales, except independent 
achievement motivation, the Hierarchy Level 
I average constitutes a reversal of the hier- 
archy trend established in the previous levels. 
In the original study, there was no such re- 
versal for Level I (Meyer & Pressel, 1954). 


Education and Age Trends 
These trends are shown in Tables 2 and 3
and Tables G and H.¹ In the current study,


education trends in the EQ-C scales were 
much more marked than in the previous study 


Table 3
EQ-C Test Scale Means According to Age Level

                            Years     Years     Years     Years
                            20-30     30-40     40-50     50+
                            N=126     N=318     N=173     N=57
EQ-C Scale                  M         M         M         M
Independent Achievement     5.9       6.2       5.9       ?
Detail                      ?         5.3       5.1       ?
Social Dominance            6.1       ?         ?         ?
Judgment and Decision       6.5       ?         ?         ?
Recognition Anxiety         4.1       ?         ?         ?
Drive Irritability          4.3       ?         ?         ?
Adjustment Psychological    2.8       ?         ?         ?
Social Extroversion         7.7       ?         ?         ?
Objectivity                 5.8       ?         ?         ?

Note. A question mark denotes an entry that is illegible in the source copy.



(Meyer & Pressel, 1954), possibly because 
the education categories were extended to in- 
clude graduate training at the top and two 
years of high school and grade school at the 
bottom. Furthermore, all 16 EQ-C scales had 
observable trends with education. The magnitude
of trend differences was greatest for
education and least for age (Table F).¹ Average
magnitude of all observed scale trends⁴
in absolute mean score values was 1.42 for
education; .87 for hierarchy; and .54 for age.
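
The magnitude figures can be read as the absolute difference between the mean scale scores at the two ends of an observed trend. The short Python sketch below assumes that reading. The function name, the index arguments, and the choice of example (recognition anxiety means by education level, as read from Table 2) are illustrative assumptions, not the authors' computation.

```python
def trend_magnitude(means, bottom=0, top=None):
    """Absolute difference between the means at the two ends of a trend.

    `means` is a list of mean scale scores ordered by level. `bottom` and
    `top` are hypothetical index positions marking where the observed
    trend starts and ends, so that a reversal at an extreme level can be
    left out of the measure.
    """
    if top is None:
        top = len(means) - 1
    return abs(means[top] - means[bottom])


# Recognition anxiety means by education level, graduate work first and
# grade school last, as read from Table 2 (assumed to be legible values).
recognition_anxiety = [3.2, 3.7, 3.9, 4.4, 4.6, 5.2]

# Difference between the grade school mean (5.2) and the graduate mean (3.2).
print(round(trend_magnitude(recognition_anxiety), 2))
```

Averaging such differences over all scales that show a trend would give a figure comparable in kind to the 1.42, .87, and .54 reported above, although the exact levels the authors treated as the ends of each trend are given only in the deposited Table F.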

With the exception of detail and those 
scales that had no trend for age, the direc- 
tions of observed scale trends with education 
were opposite of those for age, whereas all 
observed hierarchy trends were in the same 
direction as those for education (Table F). 
While no single classification F test of sig- 
nificance of mean EQ-C scale score differ- 
ences was carried out for education and age 
categories because of budget limitations, it 
might well be predicted from the magnitude 
and consistency of the trends that many of 
the scales having trends with education and 
few of the scales having trends with age 
would also have F test significance. It should 
be kept in mind that in each of these single 
variable classifications of hierarchy, education 
and age, as shown in Tables 1, 2, and 3, only 
the variable described is controlled. Actually,
in this management sample, all three
variables are interrelated as shown in Table 4.


Independence of Trends 


Table A³ shows the trend results for hierarchy,
age, and education when two of the
variables are controlled. In the execution of
the previously described procedures³ for computing
independence of trends in a double
classification system without using analysis of
variance, none of the hierarchy trends of the
EQ-C scales was found to exist independently
of education at the .05 level of significance
when education was controlled. When age
was controlled, hierarchy trends persisted at
the .05 level of significance for social dominance,
judgment and decision, detail, adjustment
psychological, and psychopathic tendencies.


Table 4
Age and Education Means According to Hierarchy Level

                        Hierarchy Level
               I      II     III    IV     V      VI
Age*           44.7   43.0   40.0   36.9   33.5   32.4
Education**    4.7    5.0    4.6    3.9    3.3    2.9

* In years.
** 6 = Graduate Level Work; 5 = 4 Years College; 4 = 2
Years College; 3 = 4 Years High School; 2 = 2 Years High
School; 1 = Grade School.

On the other hand, when hierarchy is held
constant, education trends at the .05 level
of confidence persist in the EQ-C scales of
judgment and decision, detail, psychopathic
tendencies, recognition anxiety, compensatory
achievement, and independent achievement.
With hierarchy controlled, significant age
trends persist for objectivity, social consideration,
judgment and decision, and drive persistence
(Table A).³ All of these significant
trends with two controlled variables were in
the same direction as the trends for education,
hierarchy, and age when only one variable
is controlled.


Discussion

Several points need the clarification of discussion
to place this study in clear perspective.
These points are: first, the observed,
rather than statistically proven, existence of
the EQ-C scale trends; second, the possibility
of interpreting the observed trends as due to
differences in objectivity or social desirability
bias; third, the possibility of interpreting the
observed hierarchy trends and the reversal of
trends at the top level as a function of underlying
education and age trends in both
personality scale scores and hierarchy composition;
fourth, the possibility that independent
achievement motivation is the only
independent personality scale predictor of
hierarchy, the other personality scale trends
being by-products of selection by education
and advancement by age (experience).


Validity of Observed Trends

It should be kept in mind that neither the
single classification F test nor the number
of changes in the predominant direction of
change tests are final statistical proof of the
validity of trends. Trends reported in this
study are observed upward or downward progressions
of scale score means in successive
levels of hierarchy, age, or education.
The magnitude of the changes and the number
of reversals to be allowed is a matter of
judgment. In this study, the F test analysis
of variance helps the certainty of the trend
observation in that it implies that the observed
differences between the means are not
due to chance and are likely to be repeatable.
It in no way certifies that there is a linear
progression. In general practice, however,
unless there is a certain magnitude of mean
score differences and consistency of progression,
the scale will not come out with F
test significance. However, curvilinear relationships,
as well as linear trends, will also
have F test significances, as occurred with social
extroversion. Therefore, the F test of
single classification analysis of variance
does not prove a trend as we have used the
term in this study. The standard for trend
is observation of progression, supported by
F test for verification that the observation
of mean differences is not due to chance.

The inadequacies of the number of changes
test of trend have been pointed out. It assures
the improbability of such a consistent
progression occurring by chance, but it does
not insure that the means on which the progression
is based are reliable. This test was
used for the testing of independence of
trends from one of the other variables because
of the unsuitability of the data to the
double classification analysis of variance.
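
To make concrete what the single classification F test does and does not show, the following Python sketch computes an F ratio from group sizes, means, and sigmas alone. It is a minimal illustration under stated assumptions, not the authors' procedure: the function name is hypothetical, the sigmas are assumed to have been computed with an n - 1 denominator, the Level VI sample size of 129 is inferred from the total of 678 cases, and the recognition anxiety values are as read from Table 1, where the Level VI entries are uncertain.

```python
def one_way_f(ns, means, sds):
    """Single classification F ratio from group sizes, means, and SDs.

    Assumes each SD was computed with an n - 1 denominator; the paper
    does not state how its sigmas were computed.
    """
    total_n = sum(ns)
    k = len(ns)
    grand_mean = sum(n * m for n, m in zip(ns, means)) / total_n
    # Between-groups sum of squares: spread of group means around the grand mean.
    ss_between = sum(n * (m - grand_mean) ** 2 for n, m in zip(ns, means))
    # Within-groups sum of squares recovered from each group's variance.
    ss_within = sum((n - 1) * s ** 2 for n, s in zip(ns, sds))
    df_between, df_within = k - 1, total_n - k
    f_ratio = (ss_between / df_between) / (ss_within / df_within)
    return f_ratio, df_between, df_within


# Recognition anxiety by hierarchy level, Level I through Level VI.
ns = [30, 44, 144, 253, 78, 129]        # Level VI N is an inferred value
means = [4.3, 3.6, 3.6, 4.1, 4.3, 4.5]  # Level VI mean is an uncertain reading
sds = [1.3, 1.3, 1.3, 1.6, 1.5, 1.5]

f_ratio, df_b, df_w = one_way_f(ns, means, sds)
print(f"F({df_b}, {df_w}) = {f_ratio:.2f}")
```

The ratio depends only on how far the group means lie from the grand mean relative to the within-group spread. The order of the groups does not enter the calculation, so a curvilinear pattern of means yields a large F as readily as a steady progression, which is the limitation noted above for social extroversion.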


Objectivity and Social Desirability Bias 

In the original study, objectivity seemed
much more of a general bias indicator than in
the present study. It is to be remembered
that in the revision of the EQ-B to form the
EQ-C, special care was taken to eliminate
items from other scales that also discriminated
between high and low objectivity
groups. In doing so, the objectivity scale
seems to have been changed so that it no
longer is as independent of hierarchy, age,
and education levels. High objectivity, defined
as willingness to admit to some minor
undesirable behavior characteristics, has become
a favored trait, one which declines with
age and increases with education (Tables 2
and 3) (Table A).³ However, there is no
evidence that the trends in hierarchy are a 
function of objectivity bias (Table 1). While 
the hierarchy trends are generally in the di- 
rection of a more socially desirable profile of 
scale scores, there is no evidence that objec- 
tivity scores parallel and make possible this 
progression as the hierarchy ascends. 

Inasmuch as social dominance and social 
extroversion appear to be relatively unrelated 
to objectivity bias in the EQ-C, as shown by 
intercorrelations (Table B), an extensive so- 
cial desirability bias study along the lines 
laid down by Edwards (1957a) was under- 
taken and is now well along. It uses the data 
of the present study and there is enough 
analysis completed to indicate that degree of 
social desirability bias is certainly a strong 
determinant of magnitude of scores on the 
more obviously socially desirable or undesir- 
able scales such as social extroversion and ad- 
justment somatic, but not with almost neutral 
scales such as detail. This social desirability 
study also indicates that most hierarchy and 
education trends of the present study are in 
a socially approved direction, and age trends, 
such as they are, generally are not. How- 
ever, there are some notable exceptions to this 
generalization. Detail has declining trends 
with all three variables and is socially neutral. 
Objectivity is socially undesirable and social
extroversion is socially desirable, but neither
has a trend with hierarchy and both have upward
trends with education.

As Edwards (1957a) has pointed out, people
are inclined to answer personality test
questions in a socially desirable way, to a degree
depending on the social desirability of
the question. They still differ rather widely
in their willingness to attribute socially undesirable
behavior to themselves, depending
possibly on the strength of their personal
attitudes of social inferiority or social superiority.
The question is whether individual
differences in this area can account for the
major part of these trends of personality test
scores in management hierarchy or, for that
matter, in education and age. The answer to
this question must await an intensive analysis
of the data of the social desirability bias
study. While the F tests of hierarchy, education,
and age means of the new 40-item
social desirability bias scale of the EQ-C are
not complete, the means themselves have no
observable trend with hierarchy or age. However,
there is a distinctly observable upward
trend of this social bias scale with education.
Studies are under way of the relation of social
desirability bias scores on the EQ-C to
scale scores on the Edwards Personal Preference
Schedule (Edwards, 1957b), which also
will help to obtain a better answer to this
question.


Hierarchy Level I Trend Reversals 


Another finding which requires further discussion
is the marked failure of the top hierarchy
group to follow the otherwise observable
hierarchy trends in EQ-C scale average
scores. While it is admittedly possible that
the reversals of observed trends in the top
hierarchy group were due to an inadequate
sample (30 cases in Level I), another explanation
seems more plausible. The suggested hypothesis
is that the trend reversals in Hierarchy
Level I are a function of the composition
of the sample with regard to actual age
and education. While there is an actual age
and education trend with hierarchy that is
quite marked (Table 4), in Hierarchy Level
I, the actual education trend is reversed.
That is, the people in Level I have somewhat
less education on the average than those
in Level II, but have a higher age. Since the
hierarchy trends in the EQ-C scales tend to
be in the same direction as the education
trends in the EQ-C scales, a reversal of EQ-C
scale trends with a reversal of actual education
is to be expected. Also, since the EQ-C age
trends tend to be the opposite
of EQ-C hierarchy trends, and there is an
actual age trend in the hierarchy, it would be
expected that the reversal for this maximum
age group of Level I would be greater than
that accountable for by the education reversal
alone. Hence, more specifically, the
reversal of the actual education trend and the
continuation of the actual age trend could
account for a good part of the marked trend
reversals observed in EQ-C scales for Hierarchy
Level I.




The continuation of the hierarchy trend for
the independent achievement motivation scale
is the main exception to this hypothesis, and
suggests that being in Level I rather than in
Level II in the management hierarchy is more
a function of greater achievement motivation
or independent resourcefulness than of better
personality in other respects, as defined by
society. It should be kept in mind, of course,
that Hierarchy Level I and II samples are
relatively small and that they are largely
drawn from medium-sized, rather than large,
companies. What determines becoming an
officer or general manager in small and medium-sized
businesses may not be the determinant
for very large businesses.


Personality Versus Experience and Education

Two conflicting general hypotheses concerning
personality test scores in the management
hierarchy are suggested. The first is
that there are no strong trends except for independent
achievement. And, except for that
one scale, the trends observed are a function
of age and education influences on scale scores
and the actual age and education composition
of the management hierarchy. According to
this hypothesis, management selection and
advancement is by education and experience
(age), supported by resourcefulness and independence
in achieving job results. The observed
hierarchy trends in personality test
scores, except for independent achievement,
are accidents of educational and age influences
on personality.

The second general hypothesis is that position
in the management hierarchy is a result
of a selective process whereby more intelligent⁵
people with better personalities, as
defined by society, and stronger independent
achievement motivation generally tend to rise
higher in the hierarchy with age and experience
than their colleagues less talented in
these respects. In this process, they are constantly
fighting a rear-guard action against
the retrograde effects of age in all the personality
variables. According to this hypothesis,


⁵ A hierarchy study of intelligence test scores has
been completed by E. L. Kendall of the S. J. & H.
permanent staff, using the same sample as the present
study. It is being prepared for publication at
this time.




people with superior ability in intelligence,
personality, and achievement motivation also
tend to achieve a higher education level. At
the time of testing, education is completed
while hierarchy ascension is not. Therefore,
the relationship of these measures to education
is greater than to hierarchy. Unfortunately,
higher education also makes for more
social sophistication and inclination to social
desirability bias, as contrasted with extremely
poor education. However, in the range of
the management hierarchy education averages
(four years high school to four years college),
the difference in social desirability bias
is not marked enough to interfere greatly with
personality test score results.

This broad ability selection hypothesis is
weakest when applied to small businesses,
where the owner starts at the top through
unusual initiative, and gradually builds a
hierarchy under him. It fits best the large
business where promotion is from
within the firm, from the bottom to the top,
and the hierarchy is well structured.


Summary

The present study achieved one of its major
purposes in showing that a 678-case, 1955-1957
sample of management appraisal candidates
had the same observed trends in personality
test scores with hierarchy as did a
459-case, 1949-1952 sample. The 1955-1957
study utilized data from the EQ-C, a revision
of the EQ-B utilized for the 1949-1952 study.
The observed trends for personality test scores
in hierarchy were the same for both studies
with regard to the closely correlated scales of
social dominance, a marked upward trend;
detail, a marked downward trend; and adjustment
psychological, a mild downward
trend. Also, F tests on hierarchy were significant
for these scales. In both studies,
objectivity and social extroversion showed no
hierarchy trends and had no F-test significance.
Only drive and emotionality differed
between the two studies. Drive was negative in
trend and had no F-test significance on the
EQ-B, and was negative in trend and had F-test
significance on the EQ-C. Emotionality
had F-test significance and was negative in
trend on the EQ-B, and did not have significance
and was negative in trend on the EQ-C.
It was emphasized in the discussion, that an 
analysis of variance F test is not absolute 
proof of trend and that the trends presented 
were merely observed. The F test supports 
the probability of the observed mean differ- 
ences not being due to chance. 

In addition, nine new scales, developed for 
the EQ-C after the EQ-B study, were ob- 
served for hierarchy trend. Of these, inde- 
pendent achievement and judgment and deci- 
sion showed the most marked, positive trends 
and had the greatest F test significance; and 
recognition anxiety had the most marked, 
negative trend and F test significance. Of 
the other five new scales, social consideration, 
adjustment somatic, psychopathic tendencies, 
and personal achievement showed mild trends 
with hierarchy, all negative except for social 
consideration, and all had F-test significance. 
Compensatory achievement and drive persist- 
ence showed no trends with hierarchy sup- 
ported by F-test significance. Of all the scale 
trends on hierarchy of the EQ-C, only inde- 
pendent achievement motivation was not re- 
versed at Hierarchy Level I. It was the only 
perfect trend and the only trend to be greater 
in magnitude with hierarchy than with edu- 
cation. 

EQ-C scale trends of an even more marked 
nature, but in the same direction as for hier- 
archy, were noted for education. Opposite, 
but lesser trends were observed for age, ex- 
cept for detail where the trend direction was 
the same. Neither education nor age trends 
in EQ-C scales were tested for support by 
analysis of variance. Education had observ- 
able trends in all 16 scales, but hierarchy and 
age did not. Also, there was a marked trend 
for actual age and education to increase with 
hierarchy. Again, these trend observations 
were not supported by analysis of variance. 

Part of the trend reversal for Hierarchy I 
was hypothesized to be accountable for in 
terms of the reversal of the Hierarchy Level I 
group from the actual education trend in hier- 
archy. This allowed the negative influence of 
the age trend to continue and the positive 
education trend to be reversed for Hierarchy
Level I EQ-C scale averages.

\ double classification control of a new 
type, for independence of paired variables, 


220 


carried out by counting changes in the trend 
direction and computing the probability of 
these occurring by chance, was conducted. 
This indicated that none of the hierarchy 
trends was demonstrably independent of edu- 
cation when education was controlled. Five 
scales showed hierarchy trends when age was 
controlled. Six scales showed education trends 
when hierarchy was controlled, and four scales 
showed age trends when hierarchy was con- 
trolled. These double classification studies 
were held to be only rough indicators of in- 
dependence of trends because the counting 
method of measuring trend did not give any 
weight to the N in each category, except to
arbitrarily specify that N must be a certain
magnitude to be counted. 

Brief reference was made to additional S. J.
& H. studies, now in process, of the influence 
of social desirability bias on the EQ-C and 
the Edwards Personal Preference Schedule, 
and of intelligence test score trends in the 


€ were men- 
ypothesis that 
not primarily 
bias. 


Henry D. Meyer and Alan J. Fredian 


lection is by education. Management hierarchy
trends in personality scale scores are
merely an accident of the education and age
composition of the hierarchy, and only independent
achievement is a truly independent
personality predictor of hierarchy level.
A second general hypothesis was suggested,
contrary to the first: that personality scale
trend results in hierarchy are a result of the
actual hierarchy composition, in which the more
able people in intelligence, personality, and
independent achievement motivation tend,
with age and experience in the company, to
ascend the management hierarchy and, in
earlier years, to achieve higher education.


Received December 31, 1958. 
Early Publication, 


References 


Adler, A. The practice and theory of individual psychology.
New York: Harcourt, Brace, 1927.

Edwards, A. L. The social desirability variable in
personality assessment and research. New York:
The Dryden Press, 1957. (a)

Edwards, A. L. Edwards Personal Preference Schedule,
Manual. New York: Psychological Corporation,
1957. (b)

McClelland, D. C., Atkinson, J. W., Clark, R. A., &
Lowell, E. L. The achievement motive. New
York: Appleton-Century-Crofts, 1953.

Meyer, H. D., & Pressel, G. L. Personality test
scores in the management hierarchy. J. appl.
Psychol., 1954, 38, 73-80.

Rothe, H. F. Use of an objectivity key on a short
industrial personality questionnaire. J. appl. Psychol.,
1950, 34, 98-101.

Snedecor, G. W. Statistical methods. (4th ed.)
Ames, Iowa: Iowa State Coll. Press, 1946.


Journal of Applied Psychology 


Vol. 43, No. 4


AUGUST, 1959 


DIFFERENTIAL PERCEPTION OF CERTAIN JOBS 
AND PEOPLE BY MANAGERS, CLERKS, AND 
WORKERS IN INDUSTRY ¹


HARRY C. TRIANDIS ²


Cornell University 


The recent development of the semantic
differential by Osgood and his associates (Osgood,
Suci, & Tannenbaum, 1957) has provided
a procedure of great simplicity and
flexibility for the study of the frames of reference
of industrial subjects. Ss are required
to rate a series of concepts on a series of
scales. The means of the ratings for a given
group provide information about the group
frame of reference. Weaver (1958) compared
the meaning of 10 concepts for members of
management and labor leaders and found significant
differences in the meanings of the
concepts “the closed shop,” “grievance,” “the
labor movement,” “working during a strike,”
“labor in politics,” and other concepts, between
the two groups. The present study
describes the use of the semantic differential
for the study of how certain jobs and certain
people are perceived by various groups of
industrial Ss.


Method


Technical Note


Osgood and his associates used mostly college students
as Ss. The writer’s attempt to use the semantic
differential with workers suggested that these Ss
find it extremely difficult to respond to “unusual”
combinations of concept and scale (e.g., Joe Dow
rated angular-rounded). For this reason, special
differentials for jobs and for people were developed.
The procedure for the development of the differentials
was as follows: First, 12 triads of jobs and 12
triads of people were presented to 105 industrial Ss
(20 workers, 30 male clerks, 30 female clerks, and
25 managers). The Ss were asked: “Which one of
these three jobs (people) is different from the other
two and why?” (e.g., triad: teacher, welder, clerk.
Response: teacher is professional, or welder is manual,
or clerk is routine worker). “What is the opposite
of this characteristic?” (e.g., unprofessional, or
white collar, or variable). The lists of the characteristics
obtained from the various groups of Ss differed
from each other. An analysis of these lists has
been published elsewhere (Triandis, 1959). A stratified
random sample of the characteristics so obtained
constituted 28 scales of each of the semantic differentials.
An additional 10 scales were selected from
Osgood, Suci, and Tannenbaum (1957), so as to
represent the seven factors of their semantic differential
factor analysis. The sheets of paper used for
this test could accommodate only 38 scales. The
differentials and the instructions that were finally
used may be found in Triandis (1958, pp. 296-298).

¹ This paper is based on portions of the writer’s
doctoral dissertation. The author gratefully acknowledges
the guidance and help of W. W. Lambert,
T. A. Ryan, and W. F. Whyte. The larger study, of
which this is a part, was supported by a grant
from the Foundation for Research on Human Behavior.

² Now at the University of Illinois.


Procedure 


The two 38-scale semantic differentials, one for 
jobs and one for people, developed as described
above, were administered to 156 Ss. Usable
answers were received from 5 members of the com- 
pany’s executive committee, 14 department managers, 
18 section managers, 32 female clerks, 28 male clerks, 
and 55 workers. The Ss rated five jobs (welder, 
teacher, vice-president, personnel director, and clerk) 
in counterbalanced order on the semantic differential. 
In addition they rated their supervisors, the com- 
pany’s personnel director, the boss of their super- 
visor, the vice-president of their division, a “fellow 
at work whom you like,” and “an effective manager 
you have known well and who is not the same as 
any of the people already rated.” 


Results and Discussion 


The means of the ratings of the various 
jobs and people on the 38 scales of the seman- 
tic differential, for groups consisting of upper 




managers, lower managers, female clerks, male 
clerks, and workers, can be found elsewhere 
(Triandis, 1958). Limitations of space do 
not permit complete presentation of the find- 
ings. A general observation, however, is pos- 
sible; the means of the various groups on 
most of the scales of the differential are very 
similar. Against this background of simi- 
larity, however, it is possible to note several 
important differences. 


Differences in the Perception of People 


A comparison of the two “ideal” concepts, 
the workmate and the manager, reveals great 
consistency between the groups and between 
levels of each of the divisions. Both ideals 
are considered successful, though the manager 
is a little more successful than the workmate. 
The ideal manager is purposeful, while the 
workmate does not have this trait. Both are 
easy to understand, stable, educated, kind, 
ambitious, though the manager is a little more 
ambitious than the workmate; gracious, 
though the workmate a little more than the 
manager; receptive, capable, active, colorful, 
cooperative, original, though the manager a 
little more than the workmate; experienced, 
young, friendly, intelligent, aggressive, though 
the manager a little more than the workmate; 
skilled, progressive, powerful, though the man- 
ager a little more than the workmate; so- 
ciable, very concerned with public relations, 
like traveling, good-humored, self-made, and 
just slightly emotional. The workmate’s pay 
is average, the manager’s high. The manager 
is more talkative. The workmate does not 
have too many headaches on the job, the man- 
ager does. Finally, the workmate is more 
satisfied than the manager; the latter is at 
times very dissatisfied. 


The Characteristics of the Successful and 
Unsuccessful Supervisors 


In this section we will undertake to answer 
the following question: Suppose there are two 
department heads who are considered as suc- 
cessful and relatively unsuccessful respectively 
by their supervisor (a vice-president). How 
are these two men perceived by their subordi- 
nates? Let us call them Mr. Effective (E),
and Mr. Ineffective (I). Both are perceived as


being successful. E is perceived to be more 




purposeful than I. But I is easier to understand.
Both are stable, educated, ambitious,
strong, capable, active, quiet, colorful, cooperative,
conventional, get high pay, experienced,
young, intelligent, satisfied, calm,
fast, skilled, progressive, powerful, sociable,
and share most of the other characteristics in
equal amounts, yet E is crude while I is gracious,
E is assuming, and I unassuming, E is
stubborn, and I is receptive, I is more friendly
and good humored, E is more unfriendly and
bad humored. In short, I is closer to the picture
of the ideal manager of most groups


e 
ented. We might conclude, then, that this
particular vice-president is more task-oriented
than employee-oriented. Three out
of the vice-presidents of this company are
similarly task-oriented.
Another comparison considered the perception
of 11 department heads who are liked
and 3 department heads who are disliked by
their subordinates. The disliked department
heads were perceived as being more difficult
to understand, cruel, crude, stubborn, uncooperative,
inexperienced, unfriendly, dissatisfied,
and assisted.


The Meaning of the Similarity Between the
Perception of the “Actual” and the
“Ideal” Supervisor


It was hypothesized that Ss like those
people who perform their role in society in
such a way that their behavior approaches the
ideal expected behavior for the particular role,
as the latter is perceived by the Ss. To test
this view, the similarity in the profiles of the
actual and the ideal supervisors was related
to ratings of liking of the supervisor on a
Thurstone-type successive intervals scale (Edwards,
1957, pp. 120-145). The profile similarity
was computed as follows:

$$S_p = 1 - \frac{\sum d^2}{36n} = 1 - \frac{D^2}{36n}$$

where d is the difference between the perception
of the actual and the ideal supervisor on


Perception of Jobs and People in Industry 223 
Table 1
D Matrix for the Perception of Five Jobs by Five Groups *

                                                  Top       Low       Female    Male
                                                  Mgrs.     Mgrs.     Clerks    Clerks    Workers
Distance Between                                  (N = 17)  (N = 25)  (N = 32)  (N = 33)  (N = 56)

Welder’s Job and Clerk’s Job                         62        55        48        59        51
Welder’s Job and Teacher’s Job                       64        56        68        63        50
Welder’s Job and Personnel Director’s                68        60        69        63        61
Welder’s Job and Vice-President’s Job                76        65        74        73        66
Clerk’s Job and Teacher’s Job                        66        54        63        64        54
Clerk’s Job and Personnel Director’s                 69        55        60        60        51
Clerk’s Job and Vice-President’s Job                 84        66        70        75        58
Teacher’s Job and Personnel Director’s               37        36        58        40        46
Teacher’s Job and Vice-President’s Job               44        41        50        49        45
Personnel Director’s and Vice-President’s Job        37        35        45        43        45

* All distances adjusted so that the same N (N = 17) is used throughout.


one of the n scales of the semantic differential,
and D is the same as the D-statistic used
by Osgood et al. and others. In our case
n = 38. The 36 is a constant which comes
from the fact that a 7-point scale was used.
The Pearson r coefficients between S_p and
L (liking ratings for the supervisor) were as follows:
for managers and top clerks r = .73 (p
< .001); for 50 clerks r = .58 (p < .001);
and for 50 workers r = .54 (p < .001). We
conclude that our hypothesis about the relation
of “ideal” and actual behaviors to
liking is confirmed.


Differences in the Perception of Jobs


There is a tendency for the workers to perceive
a Welder’s Job as involving more experience
and as being more desirable, important,
responsible, alert, difficult, professional, executive,
and skilled, and doing more things,
as well as less routine, as compared to the
other groups. In other words, there is a tendency
to value, or overevaluate, this factory
job. This suggests that any tendency of
management to minimize the importance of
this job will be perceived as offensive. This
finding is consistent with a case study by
Whyte (1956) in which it was shown that a
vice-president’s remark that a certain skilled
job was “just a watchman’s job” was so infuriating
to the workers that they joined the
CIO in large numbers. The findings suggest
that management ought to consider the
tendency of workers to value their jobs more
than management values them, in its communications
to them. In the case of clerks,
rating a Clerk’s Job, however, the data did
not reveal any tendency towards overevaluation.


Fig. 1. Relationships between five jobs as viewed
by different groups. (Panels: 17 Upper Managers,
25 Lower Managers, 32 Female Clerks, 56 Workers.)

Note. V stands for Vice-President’s Job; P, Personnel
Director’s Job; T, Teacher’s Job; C, Clerk’s
Job; W, Welder’s Job.




Osgood et al. (1957, p. 244 ff.) have de- 
scribed how the data from the semantic dif- 
ferential can be used to compute distances 
between concepts. The greater the D-sta- 
tistic, or distance between two concepts, the 
more different the two concepts seem to be to 
the Ss doing the judging on the semantic dif- 
ferential. Osgood’s procedure was used to de- 
termine the distances between the five jobs 
studied in our field project. The D-matrix 
is shown in Table 1. Two-dimensional draw- 
ings of the three-dimensional forms con- 
structed from the D-matrix are shown in 
Fig. 1. 
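
The distance computation described above can be sketched as follows. This is a minimal illustration, assuming the usual form of the D-statistic (the square root of the summed squared differences between two concepts' scale means); the function name and the four-scale profiles are hypothetical and are not data from the study.

```python
import math


def d_statistic(profile_a, profile_b):
    """Distance between two concepts: sqrt of the summed squared scale differences."""
    if len(profile_a) != len(profile_b):
        raise ValueError("both concepts must be rated on the same scales")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(profile_a, profile_b)))


# Hypothetical group means on four 7-point scales (illustration only).
jobs = {
    "welder": [2.0, 5.0, 3.0, 6.0],
    "clerk": [4.0, 4.0, 5.0, 3.0],
    "teacher": [6.0, 2.0, 6.0, 2.0],
}

# Build the upper triangle of the D-matrix, one entry per pair of jobs.
names = list(jobs)
d_matrix = {
    (a, b): d_statistic(jobs[a], jobs[b])
    for i, a in enumerate(names)
    for b in names[i + 1:]
}

for pair, d in d_matrix.items():
    print(pair, round(d, 2))
```

A larger entry in `d_matrix` corresponds to a pair of jobs that the raters describe more differently across the scales.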

The only major difference in the perception 
of the five jobs from group to group, as re- 
vealed by Fig. 1, is in the position of a Clerk’s 
Job relative to the other jobs. The workers 
view it in rather “exalted” terms, the top 
managers see it as the most different job as 
compared to the prestigeful Vice-President’s 
Job. Perhaps the large number of dissatis- 
fied clerks in the particular company is due 
to this perception of top management. Ap- 
proximately 30% of the clerks dislike their 
supervisors; this percentage is quite high for 
this company. 

It is interesting to notice that the most 
meaningful way to represent the job percep- 
tions of these groups is to draw a perpendicu- 
lar line between a Welder’s and a Vice-Presi-
dent’s Job. It means that the most signifi-
cant variable in the Perception of jobs is the 
level of the job. 

It is also interesting to note that the man- 
agers make finer discriminations between jobs 
than do the workers. This is seen from the
fact that the Ds obtained from the top man- 


agers are consistently higher than the Ds ob-
tained from the workers.
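
The comparison between the two columns can be tallied directly from Table 1. The sketch below uses the distances as printed in the Top Mgrs. and Workers columns; the helper name `compare_columns` is illustrative only.

```python
# Distances as printed in Table 1, in row order
# (Welder-Clerk, Welder-Teacher, ..., Personnel Director-Vice-President).
top_managers = [62, 64, 68, 76, 66, 69, 84, 37, 44, 37]
workers = [51, 50, 61, 66, 54, 51, 58, 46, 45, 45]


def compare_columns(ds_a, ds_b):
    """Return (number of pairs where a > b, mean of a, mean of b)."""
    higher = sum(a > b for a, b in zip(ds_a, ds_b))
    return higher, sum(ds_a) / len(ds_a), sum(ds_b) / len(ds_b)


higher, mean_top, mean_workers = compare_columns(top_managers, workers)
print(higher, "of", len(workers), "pairs higher for top managers")
print("mean D:", mean_top, "vs", mean_workers)
```

On these values the top managers' D is larger for seven of the ten pairs and on average (60.7 against 52.7); the three pairs among the Teacher's, Personnel Director's, and Vice-President's Jobs run the other way.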


The Perception of the Man and the Job 


To what extent does the man determine the
perception of a job, and the job the perception
of the man doing it? This is a complex
question. We have only made a beginning
towards answering it, but we did collect some
data that are interesting.

The 155 employees who participated in the
study rated Mr. T., the Personnel Director,
and also the Personnel Director’s Job on the semantic
differential. The reader will recall
that the semantic differential for people and
the one for jobs were not the same. Nevertheless,
we did have a few scales which were
equivalent.


Job S.D. Scales                  People S.D. Scales

                    Evaluative Scales
High-low position                Successful-unsuccessful
Requires much-no education       Educated-uneducated
Requires much-no experience      Experienced-inexperienced
High-low pay                     Gets high-low pay
Sociable-unsociable              Sociable-unsociable
Requires much-no intelligence    Intelligent-unintelligent
Clean-dirty                      Gracious-crude
Good-bad                         Good-bad humored

                    Potency Scales
Heavy-light                      Weak-strong
Important-unimportant            Powerful-powerless

                    Activity Scales
Active-passive                   Active-passive
Osgood et al. (1957, p. 91) D-scores were
obtained from the discrepancies in the ratings
of these corresponding scales. The reader
will notice that the scale correspondence is
not very close. The D-scores that were computed
were very small. This is remarkable
in view of the very rough correspondence
between the scales. It appears
that the perceptions of the job and the man
are very closely interrelated. The job acquires
the characteristics of the man and the
man the characteristics of the job.

From consideration deriving from Osgood
et al. (1957), on the reliability of the semantic
differential, we conclude that, for our
particular case, a D smaller than 48 means
that there is complete fusion of the job and
the man. For 80% of the top managers,
62% of the middle management, 68% of the
lower management, 47% of female clerks,
58% of the male clerks, and 56% of the
workers there was such complete fusion.

A reasonable hypothesis, it seemed, was
that if a person had never had any experience
with personnel men, in other words if he had
never worked before in a setting in which
there was a personnel man, this fusion between
job and man doing the job would be
even more complete. Surprisingly, this hypothesis
cannot be supported by the data. In
fact, if there is a relationship it runs in the
opposite direction (p < .25, chi square, two-tail
test).


Summary 


Five jobs and 6 people were rated on 38
scales of corresponding semantic differentials
by 156 Ss representing various groups in industry.
The differences in the perception of
the jobs and the people are discussed.


Received January 12, 1959. 




References 


Edwards, A. L. Techniques of attitude scale con- 
struction. New York: Appleton-Century-Crofts, 
1957. 

Osgood, C. E., Suci, G. J., & Tannenbaum, P. H. 
The measurement of meaning. Urbana: Univer. 
Illinois Press, 1957. 

Triandis, H. C. Some cognitive factors affecting 
communication. Unpublished doctoral disserta- 
tion, Cornell Univer., 1958. 

Triandis, H. C. Categories of thought of managers, 
clerks, and workers about jobs and people in in- 
dustry. J. appl. Psychol., 1959, 43, in press.

Weaver, C. H. The quantification of the frame of 
reference in labor management communication. J. 
appl. Psychol., 1958, 42, 1-9. 

Whyte, W. F. Engineers and workers: A case study. 
Hum. Organization, 1956, 14, 3-12. 


Journal of Applied Psychology
Vol. 43, No. 4, 1959


EFFECTS OF ALTERING TASK COMPONENTS ON 
PERCEPTUAL-MOTOR TASK LEARNING ¹


JOHN A. WHITTENBURG,² SHERMAN ROSS, AND T. G. ANDREWS


University of Maryland 


The rapid increase in the number and com- 
plexity of man-machine systems in contempo- 
rary military and industrial operations has 
demanded new attention to the problem of 
training human operators. Among other basic 
aspects of the training problem is a require- 
ment for determining the skills involved in 
the learning of complex perceptual-motor 
tasks. Knowledge of these skill requirements 
can contribute both to the development of 
efficient training procedures and to the de- 
velopment of effective task simulators to be 
used during the initial stages of training. 

One fundamental problem is the relative in- 
fluence of different task characteristics on the 
learning of complex tasks in man-machine 
systems. The purpose of this study is to 
assess the effects of altering operational com- 
ponents of a perceptual-motor task on the 
acquisition and retention in early and later 
stages of learning. The following general hy- 
pothesis was formulated: The acquisition and 
retention of various perceptual-motor skills 
can be influenced differentially by altering 
different display-control relationships of the 
same task and by introducing those altera- 


tions at different stages in the learning of that 
task. 


Several investigations of learning on perceptual-motor
tasks indicate that the introduction
of changes in a task produced positive
or negative transfer effects, and that these
effects are in part a function of the level of
learning attained prior to the introduction of
the “new” task (Andreas, Green, & Spragg,
1953; Duncan, 1953; Lewis, McAllister, &
Adams, 1951; McAllister, 1952). Previous
studies also showed that difficulty differences
between tasks often resulted in differential
transfer effects (Adams & Lewis;
Bilodeau, 1952; Jones & Bilodeau).
An open question from these results is whether
or not the differential transfer effects may be
attributed in large part to variations in the
difficulty between the original and the new
task rather than to changes in the task characteristics.
In addition, most studies dealt
with the effects of altering single task components
on the learning of the interpolated
or “new” task (Carter, 1947; Gibbs;
Helson, 1949). Since we have no satisfactory
way to equate the level of learning produced
by X trials on one task with that produced
by Y trials on another task, little systematic
knowledge is available in relation to
the effects on the learning process of altering
different task components at different stages
in learning.

In order to test the above hypothesis and
to provide a method for gaining information
regarding the relative influence of different
task characteristics on the learning process,
a perceptual-motor task requiring continuous
compensatory control-display interaction was
developed. This task was designed so that a
number of task components could be systematically
and immediately varied during
learning without changing the nature or difficulty
level of the task. These task components
were experimentally varied at each of
two stages of learning to determine their effects
on acquisition and retention.

¹ This study is one of a series on behavior decrement
supported by the Research Division, Office of
The Surgeon General, Department of the Army, Contract
No. DA-49-007-MD-222 to The University of
Maryland.

² Now at Human Sciences Research, Inc., Arlington,
Va.

Method and Procedure 


in brightness and in size to 4-in. when vi $ i 
The “on target” position shifted randomly
along the vertical axis during a given trial. S
was required to locate the on target position and to
make appropriate compensatory movements to remain
on target. In addition to the random displacement
of the on target position during a given trial, a
cam device driven by a 1-rpm motor resulted in
variable changes in the direction and amount of
vertical movement of the target indicator. The
maximum vertical displacement of the target indicator
by the cam device was 2.9 cm. above and below
the center of the face of the oscilloscope. S was
required to compensate for the cam-driven vertical
movements as well as to shift continuously the position
of the target indicator to stay on target. To
perform this task, S was provided a 2-in. knob
mounted horizontally on a shaft and positioned directly
below the face of the oscilloscope. Turning
the knob in both directions and in varying amounts
altered the direction and extent of movement of the
target indicator. A trial lasted 1 min. and was terminated
by a micro-switch operated by the cam.
Another trial started when E depressed a contact
switch which activated the 1-rpm motor. The criterion
measure of performance was time on target
recorded by a chronoscope sensitive to .01 sec. The
chronoscope was activated only when the S was on
target and the time was cumulated over each 1-min.
trial.
Task components. Three major functional components
of perceptual-motor tasks were selected for
the task to be learned: (a) directional relationships
between control changes and display changes; (b)
rate of change relationships between control and display
presentation; and (c) relative torque relationships
between control and display movements. These
functional components may be characterized as forming
integral parts of the operation of many perceptual-motor
tasks.

The display-control directional relationship was implemented
by linking a horizontal movement of the
control knob to a vertical movement of the target
indicator. Using a switch to change the polarity in
the circuit instantly reversed the directional relationship.
The display-control rate relationship was implemented
by selecting certain values expressing the
relationship between amount of control movement to
the amount of target indicator movement. Two resistors
of appropriate values were placed in the circuit
and use of a toggle switch permitted the two
rate relationships to be altered instantaneously by E.
To instrument the display-control torque relationship,
a magnetic fluid clutch was coupled to the control
shaft by means of gears. The relationship between
control knob movement and target indicator
movement was accomplished in terms of differential
torque in the control system to the extent that S
was off target. The farther S was off target, the
greater the amount of torque present in the control
movement. To vary the torque values a variable
resistor was placed in the circuit. A knob attached
to the resistor operated by E permitted a rapid selection
to be made between the two torque values.
There are eight separate tasks possible when the
three display-control components are varied in each
of two ways. To facilitate discussion of these tasks,
a coding scheme was constructed to illustrate the
different tasks in terms of their distinct display-control
characteristics (see Table 1). These eight tasks
were used in a pilot study to answer specific questions
asked below and four of these task variations
were used in the main experiment.
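
The coding scheme can be enumerated mechanically. The sketch below is an illustration only: the level descriptions are taken from Table 1, the standard and experimental codes are those named for the main experiment, and the function names are hypothetical.

```python
from itertools import product

# Two levels (a, b) for each display-control variable, as listed in Table 1.
LEVELS = {
    "T": {"a": "maximum torque of 4.5 in. oz.", "b": "maximum torque of 11.5 in. oz."},
    "D": {"a": "control clockwise, display upward", "b": "control clockwise, display downward"},
    "R": {"a": "1-1 control-to-display ratio", "b": "1.5-1 control-to-display ratio"},
}


def task_codes():
    """All eight coded tasks, ordered TaDaRa, TaDaRb, ..., TbDbRb."""
    return ["".join(var + lvl for var, lvl in zip("TDR", combo))
            for combo in product("ab", repeat=3)]


def split_code(code):
    """Split a code such as 'TaDbRa' into ['Ta', 'Db', 'Ra']."""
    return [code[i:i + 2] for i in range(0, len(code), 2)]


def changed_variables(code_a, code_b):
    """Variables (T, D, R) on which two coded tasks differ."""
    return [a[0] for a, b in zip(split_code(code_a), split_code(code_b)) if a != b]


codes = task_codes()
print(codes)

standard = "TaDbRa"
for experimental in ("TbDbRa", "TaDaRa", "TaDbRb"):
    print(experimental, "differs from the standard task on", changed_variables(standard, experimental))
```

Each of the three experimental codes differs from the standard code on exactly one variable, which is the property the design relies on.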


Pilot Study 


Purpose. A pilot study was done to ob- 
tain information regarding two requirements 
for the perceptual-motor task: namely, sig- 
nificant practice effects in one experimental 
session which would permit a meaningful dis- 
tinction to be made between the early and 
later learning stages, and comparability in 
difficulty between variations in the task com- 
ponents. 


Table 1

Scheme for Coding the Display-Control Variables

Display-Control Variables                          Code

1. Rate Variable
   a. 1-1 Ratio (Control to Display)               Ra
   b. 1.5-1 Ratio (Control to Display)             Rb

2. Direction Variable
   a. Control-Clockwise Direction,
      Display-Upward Direction                     Da
   b. Control-Clockwise Direction,
      Display-Downward Direction                   Db

3. Torque Variable
   a. Maximum of 4.5 in. oz.                       Ta
   b. Maximum of 11.5 in. oz.                      Tb

Coded Tasks

TaDaRa    TaDaRb    TaDbRa    TaDbRb
TbDaRa    TbDaRb    TbDbRa    TbDbRb




Subjects and procedures. The data for 
both the pilot study and the main experiment 
were collected at the U.S. Army Aberdeen 
Proving Grounds in Aberdeen, Maryland.
The sample came from a population of male 
soldiers who had just completed basic train- 
ing and were awaiting reassignment to vari- 
ous schools for specialized training. The sam- 
ple was drawn from a Casual Company lo- 
cated on the base. Eighty Ss were used in 
the pilot study, 10 Ss for each of the eight 
task variations shown in Table 1. Each S
was given 20 one-minute trials on the task.
There were 25 secs. between trials. Immedi-
ately preceding each trial a small light was 
turned on above the face of the oscilloscope 
which informed S that 3 secs. later E would 
say “start” and the trial would begin. Each 
trial began with the target indicator in the 
on target position. 

Pilot study results. Determination of 
whether significant practice effects did occur 
in 20 trials for the eight task variations was 
made as follows: To stabilize the time per 
trial estimates for a statistical analysis of 
practice effects, the trials were grouped in 5 
blocks of 4 trials each. Thus, total time 
on target scores for each S were summated
through 4 trials at a time. The analysis of 
time on target data was performed by a “re- 

peated measurements” analysis of variance 
technique presented by Edwards (1950). The 
results of the statistical analysis demonstrated 
highly significant practice effects for all task 
variations.
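
The grouping of trials into blocks can be sketched as follows. The time-on-target values are hypothetical and the function name is illustrative; only the block structure (5 blocks of 4 one-minute trials) comes from the text.

```python
def block_scores(trial_times, block_size=4):
    """Sum time-on-target scores over consecutive blocks of trials."""
    if len(trial_times) % block_size != 0:
        raise ValueError("number of trials must be a multiple of the block size")
    return [sum(trial_times[i:i + block_size])
            for i in range(0, len(trial_times), block_size)]


# Hypothetical time on target (sec.) for one S over 20 trials (illustration only).
times = [10.0 + 0.5 * trial for trial in range(20)]

blocks = block_scores(times)
print(blocks)       # five block totals, one per 4-trial block
print(sum(blocks))  # total over all 20 trials
```

The five block totals are the repeated measurements entered for each S; the grand total over all 20 trials is the single score used when the task variations are compared for over-all difficulty.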

The second question concerned the com- 

parability in difficulty among the eight task 
variations. For statistical analysis of the 
data, total time on target scores for each S 
was summated through all 20 trials and 
treated as the score. The statistical analysis 
was made by using the same repeated meas- 
urements analysis of variance technique men- 
tioned above. The results demonstrated no 
significant differences among the task varia- 
tions with regard to over-all ease of learning 
as measured by total time on target. On the 
basis of the results obtained from the pilot 
study, the requirements of the task for sig-
nificant practice effects in one experimental
session and comparability in difficulty among 
the eight task variations were satisfied. 


J. A. Whittenburg, S. Ross, and T. G. Andrews


Main Experiment 


Selection of task components. One task
was randomly selected as the standard task
on which each S initially practiced, and the
experimental tasks were selected so that each
involved one variation of the three components
found in the standard task. The
standard task had the display-control coding
TaDbRa: a maximum torque of 4.5 in. oz.,
a control movement in the clockwise direction
and the display movement in the downward
direction, and a 1 to 1 ratio of control to
display movement. The experimental tasks
included TbDbRa (manipulation of torque),
TaDaRa (manipulation of the directional relationship),
and TaDbRb (manipulation of
the rate relationship). The pilot study results
provided a general basis for identifying
the early and the later learning conditions.
It was assumed that
five interpolated trials on the experimental
tasks would be sufficient to test the hypothesis.
We believed that this arrangement would
reduce the possibility of having either too few
trials to provide an adequate test of the hypothesis,
or too many trials resulting in possible
overlearning of the interpolated task
conditions (McGeoch, 1952; Osgood, 1953).
To assess the effects of altering task components
at different stages of learning on the
acquisition and retention of the task-required
skills, the number of trials for both conditions
was extended to 35. This increase in
the number of trials was designed to permit
comparison of the conditions
using the performance data obtained
on the last few trials. Based on the above considerations,
an experimental design was formulated
to test the hypothesis (see Table 2).

Subjects. One hundred and eight Ss drawn
from the same population as in the pilot study
were used. Ss were randomly assigned to
the conditions: 48 to the early learning, 48
to the later learning, and 12 to the standard
task. The 48 assigned to both early and
later learning conditions were assigned randomly
to the three experimental groups. One
group of 12 Ss was designated as a control
group, receiving no practice on the task
during the interpolated period. The Ss in
the control group were given magazines to
read during the equivalent amount of time
that the experimental groups performed on
the altered task conditions. There were 35
trials for each S. Each trial lasted 1 min.
with 25 secs. between trials. Time on target
was used as the primary measure of learning.


Table 2

The Experimental Task Variations During Early and Later Learning Stages

                                        Trials
                  1...5      6...10     11...15    16...20    21...35

Early Learning    Standard   TbDbRa     Standard   Standard   Standard
                             TaDaRa
                             TaDbRb
                             Control

Later Learning    Standard   Standard   Standard   TbDbRa     Standard
                                                   TaDaRa
                                                   TaDbRb
                                                   Control


Results 


Comparability of Ss. An initial determination
was made of the comparability of the Ss
assigned to the standard, control, and the
experimental tasks for both the early
and later learning conditions. A single classification
analysis of variance of total time on
target scores for the initial five trials for all
Ss was made. The results indicated that the
Ss assigned to the different conditions did not
differ significantly in total time scores for the
initial five trials on the standard task.

Method of statistical analysis. To increase
the sensitivity of statistical analyses,
time on target scores for all Ss for the first
five trials on the standard task were used to
adjust scores for individual performance differences
on each condition. A total score for
the first five trials was obtained for each S.

These scores were used to adjust scores for 
testing the experimental hypothesis by analy- 
sis of covariance procedure (Johnson, 1949). 
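
The covariance adjustment described above can be sketched as follows. This is a minimal illustration, not the authors' computation: it assumes a single covariate (total score on the first five trials), uses the pooled within-group regression slope, and runs on invented scores. The function name and the data are hypothetical.

```python
from statistics import mean

def adjusted_means(groups):
    """Adjust each group's criterion mean for a single covariate.

    groups maps a condition name to a list of (covariate, criterion)
    pairs. The adjustment uses the pooled within-group slope b_w:
        adjusted = criterion mean - b_w * (covariate mean - grand covariate mean)
    """
    grand_x = mean(x for pairs in groups.values() for x, _ in pairs)
    sxy = sxx = 0.0
    for pairs in groups.values():
        mx = mean(x for x, _ in pairs)
        my = mean(y for _, y in pairs)
        sxy += sum((x - mx) * (y - my) for x, y in pairs)
        sxx += sum((x - mx) ** 2 for x, _ in pairs)
    b_w = sxy / sxx  # pooled within-group regression slope
    return {
        name: mean(y for _, y in pairs)
        - b_w * (mean(x for x, _ in pairs) - grand_x)
        for name, pairs in groups.items()
    }

# Hypothetical data: (total of first five trials, posttrial score).
scores = {
    "control": [(120, 28), (140, 31), (160, 34)],
    "torque": [(130, 33), (150, 36), (170, 39)],
}
adj = adjusted_means(scores)
print(adj)
```

A group that started above the grand covariate mean has its criterion mean lowered, and a group that started below has it raised, which is why adjusted and unadjusted means differ in the tables.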

The effects of variations in task conditions. 
This aspect of the hypothesis was tested in 
two ways. First, it was reasoned that if 
variations in task conditions differentially af- 
fected the acquisition and retention of the 
task-required skills, criterion scores on the 
standard task following the experimental 
treatments should differ significantly as a 
function of the interpolated practice on the 
altered task components. Adjusted time on 
target scores subsequent to experimental vari- 
ations of the task components were analyzed 
for both early and later learning conditions. 
To obtain relatively stable time scores, the 
total time on target scores for the 5 posttrials 
were used and an analysis of covariance was 
performed. Analysis of criterion scores for 
both early and later learning conditions failed 
to reveal significant differential effects of al- 
tering task components on the performance of 
the standard task (see Fig. 1 and 2). A fur- 
ther analysis using only the first posttrial for 
early and later learning conditions was per- 
formed. The results of this analysis showed 
differences in time on target scores that were 
significant beyond the .05 level for the early 
learning condition (see Table 3). No signifi- 
cant differences were obtained on the first 
posttrial for the later learning conditions. In- 
spection of the adjusted means for the early 
learning condition indicated that changing the 
torque characteristic facilitated performance 
on the standard task. The findings also 




J. A. Whittenburg, S. Ross, and T. G. Ross 


[Figure 1: line graph of average time on target (secs.) by blocks of five trials (1-5 through 31-35), with separate curves for the standard, control, direction, rate, and torque conditions; the altered-task block is marked.]

Fig. 1. Mean time on target for early learning.

[Figure 2: line graph of average time on target (secs.) by blocks of five trials (1-5 through 31-35), with separate curves for the standard, control, direction, rate, and torque conditions; the altered-task block is marked.]

Fig. 2. Mean time on target for later learning.




Table 3


Analysis of Covariance of First Posttrial During Early Learning and
Unadjusted and Adjusted Means of First Posttrial for Each Condition


                                 Sum of        Mean
Source of Variation     df      Squares       Square       F       P
Between Conditions       3     31,014.00    10,338.00    3.76    <.05
Within Conditions       43    118,117.36     2,746.92
Total                   46    149,131.36


Condition  Unadjusted Means  Adjusted Means (Secs.)
Standard  31.07  31.07 a
Control  30.65  27.33
Rate Change  27.37
Direction Change  27.73
Torque Change  35.65


a Because of lack of homogeneity of variance between standard and other conditions on the first posttrial, this condition was omitted from the analysis.


showed that changes in rate and directional characteristics led
to transient interference effects on the subsequent standard
task. Although the hypothesis was supported by this analysis, the
differential effects of altering task components on subsequent
learning were both transitory and evident only during the early
learning condition.

A further analysis was carried out on the effects of altering
task characteristics on the learning process. The adjusted time
scores obtained on the “new” tasks during the interpolated
practice period were analyzed for early and later learning
conditions. Total time on target for each S was used as the


score (see Table 4). The results showed sig- 
nificant differences for the early learning con- 
dition. The findings were identical for the 
interpolated period as for the subsequent re- 
learning trials on the standard task. Altering 
torque facilitated performance, while chang- 
ing rate and directional relationships pro- 
duced relative interference on the altered task 
conditions. An analysis of covariance on the 
interpolated task scores during the later 
learning condition failed to yield significant 
differences among the task variations. A t
test was performed between the scores made
on the reversed display-control directional 
characteristic and the standard task during 


Table 4


Analysis of Covariance of Total Time on Target Scores for Interpolated
Tasks During Early Learning and Unadjusted and Adjusted Means for
Each Condition


                                  Sum of           Mean
Source of Variation     df       Squares          Square        F       P
Between Conditions       3      981,847.66      327,282.55    7.47    <.01
Within Conditions       43    1,884,287.65       43,820.64
Total                   46    2,866,135.31


Condition           Unadjusted Means   Adjusted Means (Secs.)
Standard                 28.80              28.60
Rate Change              23.44              23.50
Direction Change         24.54              24.17
Torque Change            29.53              30.05


Table 5 


The Average Time Scores for the Last Pretrial and
First Posttrial for the Later
Learning Condition


Condition Trial 15 Trial 21 
Standard 31.7 31.2 
Control 32.4 31.9 
Direction Change 33.7 30.1 
Rate Change 32.6 31.4 
Torque Change 30.9 29.4 


the interpolated period. The results were
significant beyond the .01 level, showing
that altering display-control directional rela- 
tionship interfered with performance on the 
reversed task as compared with performance 
on the standard task. 
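
The variance ratios reported in the covariance tables follow directly from the sums of squares and degrees of freedom. The short check below recomputes the mean squares and F for the first-posttrial analysis (Table 3) from the published figures. The function name is an illustrative assumption, and no significance lookup is attempted.

```python
def f_ratio(ss_between, df_between, ss_within, df_within):
    """Return (mean square between, mean square within, F)."""
    ms_between = ss_between / df_between
    ms_within = ss_within / df_within
    return ms_between, ms_within, ms_between / ms_within

# Sums of squares and df as printed in Table 3.
ms_b, ms_w, f = f_ratio(31014.00, 3, 118117.36, 43)
print(round(ms_b, 2), round(ms_w, 2), round(f, 2))
```

The same function applies to any table that lists sums of squares with their degrees of freedom.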

Effects of altering task components at different learning
stages. The previous analysis indicated that altering different
task components during the initial acquisition of task-required
skills produced differential and transient effects. One further
analysis was performed with the later learning condition. This
analysis was based on the observation that the first posttrial
after interpolation of the experimental tasks showed that the
time scores were generally lower than the last pretrial scores.
A correlated t test of the difference score for all experimental
conditions
yielded a significant difference between trials 
15 and 21 beyond the .05 level. Apparently, 
the major effect of altering task components 
later in learning on this task is to produce a 
temporary decrement in performance on the 
retention of the standard task regardless of 
the task characteristic altered (see Table 5). 
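
A correlated (paired) t test of the kind described here works on each S's difference between the last pretrial and the first posttrial. The sketch below shows the arithmetic only. The individual scores are invented for illustration, since the article reports condition means rather than individual data, and the function name is an assumption.

```python
from math import sqrt
from statistics import mean, stdev

def correlated_t(pre, post):
    """Paired t statistic and degrees of freedom for pre minus post."""
    diffs = [a - b for a, b in zip(pre, post)]
    n = len(diffs)
    standard_error = stdev(diffs) / sqrt(n)  # stdev uses n - 1
    return mean(diffs) / standard_error, n - 1

# Hypothetical time-on-target scores (secs.) for six Ss.
trial_15 = [33.0, 31.5, 34.2, 30.8, 32.6, 33.9]
trial_21 = [31.0, 30.9, 31.8, 30.1, 31.7, 31.2]
t, df = correlated_t(trial_15, trial_21)
print(round(t, 2), df)
```

A positive t indicates lower scores on the posttrial, the direction of the decrement reported for the later learning condition.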


Discussion 


With regard to the hypothesis of this study 
and the results obtained, the effects of alter- 
ing task components at different stages of 
learning are interpreted as follows: during the 
initial stages of learning on perceptual-motor 
tasks of this type, behavior is influenced by 
the “immediate sensory information” pro- 
vided by the display-control relationships. 
Depending on the characteristics of the task, 
responding to these changed display-control 




components may either facilitate or interfere temporarily with
learning on the task. With the development of task proficiency, a
change in the display-control relationship has no appreciable
effect other than a transient interference effect. This transient
interference effect may indicate nothing more than the time
required by the operator to perceive and adjust to the altered
display-control relationship. In the task studied in this
experiment, the display-control changes were accomplished between
trials without S’s knowledge. S had to begin a task before
becoming aware of any possible changes.

An attempt to suggest possible factors which contributed to the
differential effects of altering display-control relationships on
the learning process is centered on the nature of the task and
its operational characteristics. The nature of this task was such
that maintaining fairly constant rates of control movement
resulted in higher time scores than ballistic movements. During
the initial learning trials, there is a marked tendency to
respond extensively and rapidly to changes in the display. This
aspect has been noted previously with perceptual-motor tasks
(Kellogg, 1946) and is also reflected in a measure of the amount
of control movement in the present study. The introduction of
greater torque in the control system apparently acted to reduce
the extent of control movements. Thus, S might have learned that
slower controlled movements result in greater ease in “getting
and staying on target.” On the other hand, increasing the rate
relationship between amount of control movement to target
indicator movement and reversing the directional relationship
apparently enhanced the initial tendency to ballistic movements
and oscillation behavior. Supposedly, if the rate experimental
condition had required a lower rate relationship between control
and display, this would have tended to produce the same effects
as increasing the amount of torque. One implication of this
interpretation is the hypothesis that combining several
variations in the task which tend to reduce the initial
“overcontrolling” and ballistic movements should lead to more
pronounced facilitation. On the other hand,




combining task variations that tend to enhance this initial
behavior should lead to more pronounced interference effects.
Systematic combination of different task characteristics which
compete in tending to dampen or enhance this initial control
movement behavior, and analysis of the effects on subsequent
acquisition on the standard task, may provide a more sensitive
technique for determining the relative influence of different
task characteristics on the acquisition and retention of
task-required skills. This procedure may have potential research
value when considering that the selection of techniques for
studying various facets of behavior efficiency is one of the more
difficult design problems facing the researcher in this area
(Andrews & Ross, 1955).

The results of the experiment permit the following two
conclusions to be made: (a) during the early learning stage on a
compensatory perceptual-motor task, the task-required skills are
more affected by altering task components than during the later
learning stage, and (b) the effects of altering different task
components on the acquisition of the task-required skills appear
to be a function of the operational characteristics of the task.

Summary 


The problem was to determine the effects of altering different
display-control relationships of a perceptual-motor task on the
acquisition and retention of task-required skills in early and
later stages of learning.

A preliminary experiment permitted the determination of the
learning characteristics of different task components and the
comparability in over-all performance achieved among the
components. Ten Ss were assigned to each of eight tasks in which
three display-control relationships were varied each in two ways.
These tasks involved variations in display-control directional,
rate of change, and torque relationships. Each S performed on a
task for 20 trials. Each trial lasted 1 min., with 25 secs.
between trials. Time on target was the primary measure of
performance. Results indicated significant practice effects for
all tasks. All tasks were demonstrated to be comparable in
difficulty.


The main experiment involved selection of 
one task as the standard task with three ex- 
perimental tasks selected so that each had 
only one display-control characteristic differ- 
ent from the standard task. Nine conditions 
were investigated, and 12 Ss were assigned to 
each condition. The display-control charac- 
teristics were manipulated in the early and 
later learning stages on the standard task. 
Ss were divided into two groups, one for early 
and one for later learning. Each learning 
stage condition consisted of three experimen- 
tal groups and one control group. Another 
group of 12 Ss performed all trials on the 
standard task. A total of 35 trials of 1-min. 
duration and 25 secs. between trials was given 
to each S. For the early learning condition, 
each S practiced for five trials on the stand- 
ard task, then for five trials on one of the ex- 
perimental tasks or on the control condition. 
Then all Ss practiced on the standard task 
the remaining 25 trials. For the later learn- 
ing condition, all Ss practiced on the stand- 
ard task for 15 trials, then for five trials on 
either the experimental tasks or control con- 
dition, and the remaining 15 trials on the 
standard task. Time on target was the pri- 
mary measure of performance. 

Results showed that altering different dis- 
play-control relationships early in learning 
produced transient facilitative and interfer- 
ence effects on subsequent learning of the 
standard task. Changing these characteristics 
during later learning depressed the first post- 
trial scores with respect to the last pretrial. 
It was concluded that different display-con- 
trol relationships differentially facilitate or in- 
terfere with the learning process during the 
initial learning stage and that the nature of 
the effect is related to the operational charac- 
teristics of the task. 


Received September 22, 1958. 


References 


Adams, J. A., & Lewis, D. The evaluation of “diffi- 
culty of task” under several different conditions of 
performance on the modified Mashburn apparatus. 
USN Spec. Dev. Cent. Tech. Rep., 1949, SDC 57- 
2-8. 

Andreas, B. G., Green, R. F., & Spragg, S. D. S. 
Transfer effects in perceptual-motor performance 
as a function of practice. Amer. Psychologist, 
1953, 8, 312. 


Andrews, T. G., & Ross, S. Summary report on
studies of behavior efficiency. College Park, Md.:
Univer. Maryland, 1955. (Proj. No. DA-49-007-
MD-222 [O.I. 19-52].)

Bilodeau, E. A. Transfer of training between tasks
differing in degree of physical restriction of im-
precise responses. USAF Hum. Resour. Res. Cent.
Bull., 1952, No. 52-40.

Carter, L. F., & Murray, N. L. A study of the most
effective relationships between selected control and
indicator movements. In P. M. Fitts (Ed.), Psy-
chological research on equipment design. Wash-
ington: U. S. Government Printing Office, 1947.
(AAF Aviat. Psychol. Prog. Res. Rep. No. 19.)

Duncan, C. P. Transfer in motor learning as a func-
tion of degree of first-task learning and inter-task
similarity. J. exp. Psychol., 1953, 45, 1-11.

Edwards, A. L. Experimental design in psychologi-
cal research. New York: Rinehart, 1950. Pp.
284-302.

Gibbs, C. B. The continuous regulation of skilled
response by kinesthetic feedback. Cambridge, Eng-
land: Med. Res. Council, Appl. Psychol. Res. Unit,
1953. (APU 190/33.)


Helson, H. Design of equipment and optimal human
operation. Amer. J. Psychol., 1949, 62, 473-497.

Johnson, P. O. Statistical methods in research.
New York: Prentice-Hall, 1949. Pp. 246-255.

Jones, E. I., & Bilodeau, E. A. Differential transfer
of training between motor tasks of different diffi-
culty. USAF Hum. Resour. Res. Cent. Res. Bull.,
1952, No. 52-35.

Kellogg, W. N. The learning curve for flying an air-
plane. J. appl. Psychol., 1946, 30, 435-441.

Lewis, D., McAllister, D. E., & Adams, J. A. Fa-
cilitation and interference in performance on the
modified Mashburn apparatus: I. The effects of
varying the amount of original learning. J. exp.
Psychol., 1951, 41, 247-260.

McAllister, D. E. Retroactive facilitation and inter-
ference as a function of level of learning. Amer.
J. Psychol., 1952, 65, 218-232.

McGeoch, J. A., & Irion, A. L. The psychology of
human learning. New York: Longmans, Green,
1952.

Osgood, C. E. Method and theory in experimental
psychology. New York: Oxford Univer. Press,
1953.


Journal of Applied Psychology
Vol. 43, No. 4, 1959


PERSONALITY OF THE ROUTE SALESMAN IN A
BASIC FOOD INDUSTRY 1


DAVID A. RODGERS 


University of California, Berkeley 


While much attention has been given to the
selection of salesmen, relatively little has been
given to their general personality character-
istics (see Roe [1956] for a survey of rele-
vant work). Most studies have used only one
or two assessment instruments, usually ones
highly focussed on the job situation itself.

This paper reports an intensive personality
study, using a variety of tests, of a carefully
selected sample of the route salesmen in a
national wholesale basic food company. Com-
parisons were made between the salesmen’s
personalities as seen by themselves, by their
bosses, and by a psychologist, and between
their personalities and their job requirements
as seen by themselves and their bosses. The
purpose of the study was to determine the
personality characteristics of a typical group
of route salesmen.


1 The data for this study and many of the analyses are taken from
the author’s unpublished doctoral dissertation, Personality
Correlates of Successful Role Behavior, Univer. of Chicago, 1953.
The author is indebted to Nejelski and Co., Inc., Management
Counsels, for opportunity to collect the data, to D. L. Grummon,
E. A. Haggard, and M. I. Stein for valuable assistance in
planning and executing the study, and to L. W. Porter for helpful
comments on the manuscript.


Procedure

Subjects. Two selling units in a food company of national scope
were selected by the head of the sales division as being typical
of the company selling operation. The sales manager of each unit
selected six of his route salesmen to represent a cross section
of his selling force and rank ordered them from good to poor as
employees. These 12 salesmen constitute the Ss of the study. They
sell a large line of related products to retailers on regular
routes. The work is highly competitive. It requires the Ss to
make many decisions on their own about price and display and to
adapt to many situations. The sales manager of each unit has
almost complete job authority over the Ss under him.

Role description inventory. It was desired to obtain standardized
and comparable descriptions of what the sales manager thought
each S was like (the personality, or P), what the sales manager
wanted each S to be like (the role demand, or RD), what each S
thought the sales manager wanted him to be like (the role
concept, or RC), what each S thought he himself was like (the
self concept, or SC), and what a psychologist thought each S was
like on the basis of a battery of tests and interviews (the
clinical picture, or CP). For this purpose, a Q sort (Stephenson,
1953), designated QS, was developed. QS consists of 108
statements of attitudes and behaviors expressive of 36 of
Murray’s need-press variables (Murray, 1938). Murray’s variables
were used as a basis for selecting the Q items to insure a wide
representation of characteristics with a minimum of emphasis on
any one aspect of personality or behavior.

The statements in QS were to be card sorted into 
a forced-frequency quasi-normal distribution from 
most characteristic or descriptive to least character- 
istic or descriptive of the person or role dimension 
being described. The distribution used was 1-6-18- 
29-29-18-6-1, slightly platykurtic, with g2 equal to
−0.29.
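
The kurtosis figure quoted for the forced distribution can be checked from the pile frequencies alone. The sketch below assumes the eight piles are scored as equally spaced integers and takes g2 to be the moment ratio m4/m2² − 3. The article does not spell out either convention, so both are assumptions, as is the function name.

```python
def g2(frequencies):
    """Excess kurtosis m4 / m2**2 - 3 for equally spaced pile scores."""
    n = sum(frequencies)
    scores = range(len(frequencies))
    m = sum(f * x for f, x in zip(frequencies, scores)) / n
    m2 = sum(f * (x - m) ** 2 for f, x in zip(frequencies, scores)) / n
    m4 = sum(f * (x - m) ** 4 for f, x in zip(frequencies, scores)) / n
    return m4 / m2 ** 2 - 3

q_sort = [1, 6, 18, 29, 29, 18, 6, 1]  # pile sizes for the 108 statements
print(sum(q_sort), round(g2(q_sort), 2))
```

Under these assumptions the pile sizes sum to 108 and the kurtosis rounds to −0.29, in agreement with the value given in the text.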

Data collected. Each S was given the following 
tests: Wechsler-Bellevue, Form L; Rorschach; The- 
matic Apperception Test, 20 pictures for adult males; 
Group TAT, 3 selected pictures (Stouffer & Toby, 
1951); Draw-a-Person; Draw-a-Salesman; 50 sen- 
tence completion items selected from the Stein SCT; 
50 sentence completion items for salesmen, prepared 
for this study and focussed specifically on the selling 
situation; a structured interview dealing broadly with 
personal history and attitudes; and the RC and SC 
descriptions already mentioned, using QS. The CP 
descriptions of each S were based on this battery of 
tests minus RC and SC, which were not examined 
until after the CP descriptions had been completed. 
The analyses and CP descriptions were made by the 
author, using conventional principles of interpretation. As
previously indicated, P and RD descriptions were also obtained
for each S, using QS. Reliability checks were provided by
resortings of QS for RC and SC six months after the initial
sortings and by resortings of QS for CP for each S after 11
intervening CP sortings (Table 1). No reliability checks
were made for P and RD. 


Analyses and Results 


For each of the variables CP, SC, P, RC, and RD, 
the 12 relevant QS descriptions were intercorrelated 
and factor analyzed, using Thurstone’s centroid 
method. The loadings on the first centroids are 
shown in Table 2. The obtained centroid factors
for CP, SC, P, and RC were algebraically rotated 
(Rodgers, 1957) to produce maximum correlation 
with the sales managers’ rank orderings (success rat- 
ings) of the Ss. The rotations were made to deter- 




Table 1 


Scores of Subjects on Variables Indicated 


Variable 


Subject 


1A 1B 2A 2B 3A 3B 


Boss Ranking for 
Good Employee 
Good Salesman 

Buddy Ranking for 
Good Employee 
Good Salesman 
Good Friend 


Age in Years 


Years in Sales in 
Present Company 

Years of School 

Wechsler-Bellevue 


Performance IQ 
Verbal IQ 
Total IQ 


Fisher Rigidity Index 


Sort-Resort r of 
CP Description 
SC Description 
RC Description 


S8 42 60 57 66 
oO i 22 s s 2 


1 1 2 9 3 4 4 5 5 6 ó o 
2 15 He 5 Ë ge ds as 2 85 G 55 i 
1 is 4 is as @ g 35 25 35 #5 5 T 
12 8 @ oF ¢@ g 3 25 4 5 5 a 
eS 5 M 1 As g 45 35 3 35 6 ‘ 
a 9 
oa 8 a s wm wy oy T 3142 d 
“3 2 m mw a y 6 18 5 0 3 rf 
Bow Mow w m G 2 11 4 n2 ë 83 i 
f 1 
37 i ne st ts iis ms 111 122 97 102 121 rr 
130 113 109 97 14449 109 108 113 111 109 109 18 
130 112 111 95 12 112 113 110 118 105 106 116 
ai = 38 
28 © se 8 we os 2 B w n B 
- > > 16 
477 74 64 g2 ag 66 74 9 5.7082 J 
T S9 58 30: m ag 3 a 2 3 a 


Groups A and B were ranked separately. 
Table 2

Loadings on Indicated Factors

                              Factor
              1st Centroid                Rotated
Subject a     CP   SC   P   RC   RD b     CP   SC   P   RC
1A 0 72 56 a 5 "i ao eC [ 
sia i 58 3 7 
1B 6 33 BS ai 48 ms io — 02 
2A 1 R a a » ” z wo s P 
2B TA S 53 1.00 ao : = A 
be 79 pe E 5 pr i 4 12 “od 06 
3B BO 8 4 g 1.00 = ao as % 1 
4A Jl 52 —.26 .68 68 23 — 08 —.28 10 
iR S454 36 o ös —19 o -ie T 
5A -54 Sd 228 40 49 30 45 43-38 
5B .60 7740 74 99 2% 39 —14 06 
6A 68 57 ~ 37 53 sy = 21 25 =a er 
6B 08 65 ~19 78 99 -28 ee ee 

a Numbers 1 through 6 indicate success rankings assigned by bosses, 1A and 1B being best employees in Groups A and B.
b RD descriptions factored separately for Groups A and B.




mine whether there were factor dimensions that dis- 
tinguished successful salesmen from unsuccessful ones. 
Obtained loadings are shown in Table 2. Relatively 
satisfactory discrimination was obtained for CP and P,
correlations between the rotated-factor loadings and the success
ratings being .61 and .92, respectively. Satisfactory separations
were not obtained for SC or RC, correlations between the
rotated-factor loadings and the success ratings being .17 and
.19, respectively. The failure to achieve satisfactory separation
for SC and RC is of interest and should be noted. The factor
rotation method maximizes chance as well as nonchance
relationships between the success ratings and the several (four
for SC and three for RC) sets of factor loadings making up the
orthogonal reference frames. Failure to achieve appreciable
correspondence between the rotated-factor loadings and the
success ratings, in spite of maximizing such chance similarities,
can be taken as convincing evidence that little nonchance
relationship exists.
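
For readers unfamiliar with the centroid method, the first-centroid loadings of the kind reported in Table 2 can be sketched as below. This is an illustration under stated assumptions, not the author's computation: unities are placed in the diagonal (communality estimates are a common alternative), all correlations are taken as positive so that no sign reflection is needed, and the 3 x 3 matrix is invented.

```python
from math import sqrt

def first_centroid(r):
    """First-centroid loadings: each column sum over the root of the grand sum."""
    size = len(r)
    column_sums = [sum(row[j] for row in r) for j in range(size)]
    grand_total = sum(column_sums)
    return [s / sqrt(grand_total) for s in column_sums]

# Hypothetical intercorrelations among three Q-sort descriptions.
r = [
    [1.0, 0.6, 0.5],
    [0.6, 1.0, 0.4],
    [0.5, 0.4, 1.0],
]
loadings = first_centroid(r)
print([round(a, 3) for a in loadings])
```

Descriptions that correlate highly with all the others receive high loadings, which is why uniformly high first-centroid loadings are read as evidence of a homogeneous group.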
The factor saturations of the QS items were computed for the CP,
SC, RC, and RD first centroids and for the CP and P factors
correlating highest with the success ratings. Spearman’s formula
for weighting (Spearman, 1927, Appendix XX) was used in computing
the saturations. Since the factor loadings on the P first
centroid correlated .96 with the loadings on the rotated P
factor, item saturations for the P first centroid were not
computed. They would be almost identical with those for the
rotated P factor. From the saturations, the items most
characteristic and least characteristic of each factor were
determined. According to these saturations, the boss sees the
good salesman as success-oriented, determined, and rather
dominant, in contrast to the poor salesman, who is confused,
complaining, and distractible. Each of the salesmen, good and
poor alike, tends to see himself as success-oriented, sociable,
co-operative, energetic, and confident without being … The CP
description is somewhat at variance with … of these views. In
terms of the tests, the salesmen all appear to be highly
conforming, materialistic, attention-desiring individuals who
have few internalized standards of conduct and little ability to
form or maintain close interpersonal relationships. In terms of
the tests, the successful salesmen are more dominant, vigorous,
controlled, and self-… than are the less successful ones, these
differences being similar to those seen by the bosses. For the
Ss, the job expectations are that they be success-oriented,
determined, persistent, and rather

2 Tables A to F, listing the most characteristic items and the
least characteristic items for each of the factors, and Table G,
summarizing the Rorschach scoring for each S according to Beck’s
system, have been deposited with the American Documentation
Institute. Order Document No. 5887 from ADI Auxiliary
Publications Project, Photoduplication Service, Library of
Congress, Washington 25, D. C., remitting $1.25 for 35 mm.
microfilm or $1.25 for 6 by 8 in. photocopies. Make checks
payable to Chief, Photoduplication Service, Library of Congress.


Table 3

Distributions of Fisher’s Index of Rigidity Scores


            Salesmen in             Fisher’s Data
Measure    Present Study    Normals    Hysterics    Paranoids
Mean           46.2           24.9       44.3         44.1
SD             14.4           11.8       17.2         13.6
N              12             20         20           20

dominant individuals and the salesmen have a fairly 
accurate understanding of these expectations. 

Fisher’s Index of Rigidity scores (Fisher, 1948,
1950) were computed from the Rorschach protocols 
(see Footnote 2) and are shown in Table 3. A sum- 
mary of the relationships of various variables to the 
success ratings is shown in Table 1. 


Discussion 


The first centroid factor loadings for all 
variables except P are consistently high and 
positive (Table 2), indicating that the Ss are 
a homogeneous group with similar personali- 
ties, as might be expected of people in similar 
jobs. In spite of such similarity, however, 
the good salesmen are seen by their bosses as 
being quite different from the poor salesmen, 
the P first centroid being bipolar (Table 2) 
and the loadings on it correlating .86 with 
the success ratings, by the rank difference 
method. Although much alike in other re- 
spects, the Ss differ in terms of their job 
abilities and it is primarily in terms of these 
differences that they are seen by their bosses. 
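
The rank difference method referred to above is Spearman's formula, rho = 1 − 6Σd² / n(n² − 1). The sketch below shows the arithmetic on two invented sets of ranks. It assumes untied ranks, and the function name and data are illustrative only.

```python
def rank_difference_rho(ranks_a, ranks_b):
    """Spearman rank-difference correlation for untied ranks."""
    n = len(ranks_a)
    sum_d_squared = sum((a - b) ** 2 for a, b in zip(ranks_a, ranks_b))
    return 1 - 6 * sum_d_squared / (n * (n ** 2 - 1))

# Hypothetical ranks: success ranking versus rank order of factor loadings.
success_rank = [1, 2, 3, 4, 5, 6]
loading_rank = [2, 1, 3, 4, 6, 5]
rho = rank_difference_rho(success_rank, loading_rank)
print(round(rho, 2))
```

Because the two groups of six salesmen were ranked separately, the ratings contain ties when the groups are pooled, so a tie-corrected formula would be needed to reproduce the coefficients reported in the article.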

As already noted, no significant differences 
related to job success were found in the SC 
and RC descriptions of this group. It might 
be possible to find such differences between a 
group like this and a group of Ss that were 
totally unsuited for the job, since all of the 
Ss in the present study were at least suffi- 
ciently competent to keep their jobs. The re- 
sults do suggest, however, that self and role 
descriptions on broadly based personality in- 
ventories may have much more limited utility 
for employee selection than do either descrip- 
tions on inventories narrowly focussed on the 
specific job requirements or other kinds of 
evaluations such as interviews by supervisors, 
who may be highly sensitized to those char- 
acteristics that are important for the par- 
ticular job. 



Too few Ss are involved in the study to 
give clear-cut relationships between the suc- 
cess ratings and the variables of Table 1. 
There is some indication that age and length 
of time in sales with the company may be 
related to job success. As might be expected, 
the boss rankings for selling ability correspond 
closely to the success ratings. The buddy 
rankings for selling ability and for good em- 
ployee are positively but not perfectly corre- 
lated with the boss rankings, indicating that 
a person’s abilities as seen by his colleagues 
are not necessarily the same as those seen by 
his boss. 

According to Fisher’s Index of Rigidity, the 
Ss have markedly rigid Personality structures 
(Table 3). All of the tests reveal what 
might be called Personality impoverishment. 
However, much of the “impoverishment” 
seems well suited to help the salesman in his 
job. This is especially true of certain char- 
acteristics that the author feels are highly 
descriptive of all of the Ss studied. These 
characteristics were inferred from a general 
evaluation of the test and interview mate- 
rials and from observation of the salesmen on 
their routes. The characteristics are: 

1. Dependence on other people’s opinions
and absence of own opinions. The Ss seemed
to have few strongly-held opinions of their
own. In Riesman’s terms, they were other-
directed (Riesman, 1950) to the extreme,
… them adjust easily to a
… to lean on, however,
… uncomfortable and at loose ends.

2. Cathection of the tangible and Jack of 
interest in intangible values, Being interested 
in possessing material things themselves, they 
seemed better able to convince other people 
of the value of possessing the commodities 
they were selling. They also seemed less jn- 
hibited by concepts of right and wrong or of 
proper and improper, and were correspond- 
ingly more willing to do what was pragmati- 


cally necessary to make a sale. They were 


Rodgers 


zi 
willing to put up with more than ore 
personal discomfort and inconvenience in 
der to attain monetary gain. , pasic 

3. Superficiality of relationships and one 
distrust of people. The Ss were much hi 
cerned about maintaining good sapaian ie 
lationships but seemed quite unlikely an nts 
able to form close or permanent atte 
to people. They seemed basically to dis da 
people but to conceal the distrust bebra 
facade of congeniality. This eerie a 
seemed to help them develop great er 
appearing to be friendly and helpful © other 
really becoming concerned about the 
person’s welfare. They could then ski duct 
manipulate relationships to sell their Pr oni 
without worrying about whether the ae was 
really needed the product or wae the 
the best the customer could obtain he es- 
price. Such an attitude is perhaps erson 
sence of Fromm’s (1947) marketing Pe" 
ality. ; un 

Such analyses suggest that a certain E 
of “psychopathology,” provided it 1 essen” 
right sort, may be beneficial or bari + as Í 
tial in some jobs, rather than harmfu 
often supposed, 


Summary


A group of 12 route salesmen selected to represent a cross
section of the selling force of a large company in a basic food
industry were given an extensive battery of tests. From these,
standardized descriptions of the salesmen were prepared. The
salesmen described themselves and their job requirements, and
their bosses described them and the job requirements. The various
descriptions of the Ss and their jobs are compared and are
related to the bosses’ rankings of the Ss as “good” to “poor”
employees. The personality characteristics common to all of the
salesmen and those differentiating the more successful from the
less successful are identified. The way in which the salesmen’s
personality characteristics adapt them for their job is assessed.


Received September 26, 1958. 


References 


Fisher, S. Patterns of personality rigidity and some
of their determinants. Unpublished doctoral
dissertation, Univer. of Chicago, 1948.




Fisher, S. Patterns of personality rigidity and some 
of their determinants. Psychol. Monogr., 1950, 64, 
No. 1 (Whole No. 307). 

Fromm, E. Man for himself. New York: Rinehart,
1947.

Murray, H. A. Explorations in personality. New 
York: Oxford Univer. Press, 1938. 

Riesman, D. The lonely crowd. New Haven: Yale
Univer. Press, 1950.

Rodgers, D. A. A fast approximate algebraic factor
rotation method to maximize agreement between
loadings and predetermined weights. Psychometrika,
1957, 22, 199-205.

Roe, A. The psychology of occupations. New York: 
Wiley, 1956. 

Spearman, C. The abilities of man. New York:
Macmillan, 1927.

Stephenson, W. A study of behavior: Q technique
and its methodology. Chicago: Univer. Chicago
Press, 1953.

Stouffer, S. A., & Toby, J. Role conflict and
personality. Amer. J. Sociol., 1951, 56, 395-406.


Journal of Applied Psychology
Vol. 43, No. 4, 1959 


COMPARISON OF TWO STYLES OF LEADERSHIP
IN SMALL GROUP DISCUSSION 1


RICHARD H. PAGE AND ELLIOTT McGINNIES


University of Maryland 


Although discussion group leadership may 
vary in many ways, a common distinction in- 
volves what is variously termed as “demo- 
cratic,” “non-directive,” “group-centered,” or 
“permissive” leadership as opposed to an 
“autocratic,” “directive,” or “leader-centered” 
style. Despite a vast amount of research in 
the area of leadership, it is not entirely clear 
just how directive a discussion leader should 
be for maximum effectiveness. In fact, it is 
apparent from several studies (Leavitt, 1951; 
Shaw, 1955) that a distinction must be made 
between efficiency of group performance and 
satisfaction of the members with the group, 
since the two are not always compatible. 

The weight of research findings dealing with 
leadership style seems to favor the democratic 
or group-centered approach over the authoritarian
or leader-centered variety. In the educational
setting, for example, a number of
studies (Anderson & Brewer, 1945; Flanders,
1951; Robbins, 1952) suggest that less domi- 
nating, student-supporting teacher behavior 
produces fewer hostile and aggressive re- 
sponses and more integrative group behavior 
than does more directive leadership. That 
group-centered leadership tends to result in 
greater “social-emotional growth” and insight 
is evidenced in several studies (Asch, 1951;
Bovard, 1952; Faw, 1949; Gross, 1948). It 
has also been shown that the group-centered 
approach is superior in altering the percep- 
tions of members in the direction of a group 
norm (Bovard, 1951b), in producing greater 
communication of feeling among members 
(Bovard, 1952) and in stimulating group in- 
teraction (Bovard, 1951a). Other studies 
(Hare, 1953; Preston & Heintz, 1949) have 
indicated that “participatory” leadership produces
more change in privately held opinions
and greater satisfaction with group products
than does “supervisory” leadership. There is
also evidence (Guthe, 1945; Levine & Butler,
1952; Radke & Klisurich, 1947) that group
discussion is more effective than lectures in
changing the opinions of group members.
Many of the advantages claimed for the
group-centered style of leadership can probably
be attributed to the facilitating effect of
this approach upon member participation.
Group satisfaction and productivity seem to
depend, however, upon the “quality” as well
as the “quantity” of participation (Fouriezos,
Hutt, & Guetzkow, 1950), and in this respect
it is significant to note that democratic
discussion leadership also seems to improve the
quality of group decisions (Maier & Solem,
1952).


1 This research was supported by a special grant
from the National Institute of Mental Health, United
States Public Health Service, Department of Health,
Education, and Welfare. Ss were obtained through
the cooperation of the American Association of
University Women. Donald K. Pumroy served as
discussion leader for the groups.

Evidence for the superiority of a more directive
or formal leadership role in some
group situations is also available. Member
satisfaction with decision-making in government
and industry has been found to be positively
related to control of the proceedings by the
leader (Berkowitz, 1953). Lecture methods have
been reported superior to group discussion in
classroom teaching (Asch, 1951; Husband, 1949).
Directive leadership in therapy groups resulted
in more therapeutic gains, higher satisfaction,
and better attendance than did a nondirective
approach (Evans, 1950). Finally, college
students have been reported to prefer discussion
classes which were directively led to the
more permissive discussion, with the
poorer students performing better on
examinations under directive leadership (Wispe,
1951).


In a study rather closely related in purpose
to the present one, Wischmeier (1955)
observed the discussion behavior of college
students exposed to both leader-centered and
group-centered leadership at two separate
meetings. He found that over 75% of the
group members were aware of the differences
between the two leadership roles, and that all
groups ranked the leader-centered role higher
on the value of the leader's contributions.
However, the Ss rated the group-centered
discussions higher on such items as degree of
personal involvement, warmth and friendliness
of atmosphere, ease with which contributions
could be made, and cooperativeness of
the discussion atmosphere. In discussing
these results, Wischmeier states, “This would
suggest that the group-centered leader is not
likely to receive much appreciation or recognition
for his leadership services, although he
is likely to be more ‘successful’ (in terms of
involvement, cooperativeness, etc.) in leading
his group.”

We have attempted to discover whether two
styles of discussion leadership would be
differently perceived by adult members of
small discussion groups and to determine
something of the nature of these perceptions.
The two styles may be described broadly as
directive, or leader-centered, and nondirective,
or group-centered. We asked the following
questions:


1. What attributes are assigned by members
of discussion groups to a leader playing
directive and nondirective roles?

2. Will the same discussion leader be perceived
more favorably in a directive or in a
nondirective role?

3. Are judgments about directive and nondirective
leadership related to amount of participation
in discussion by those doing the
rating?

4. Will group opinions about the value of
the discussion be determined by the type of
discussion leadership experienced?


Procedure

Subjects. Six groups ranging in size from 6 to 16
female members were scheduled to meet once each
for the purpose of viewing a mental health film,
“The Feeling of Hostility,” and holding a subsequent
45-minute discussion. The groups were assigned
randomly, three each, to directive and nondirective
leadership conditions. Biographical information obtained
from the participants showed the groups under the
two conditions to be comparable in most respects.
Average age of all Ss was 39 years, 80% were married,
over 90% had completed four or more years of
college, and family incomes averaged better than 
$7000 annually. A total of 65 women, representing 
upper-level groups both educationally and economi- 
cally, constituted our experimental population. Since 
these are fairly typical of the types of individuals 
who are active in educational programs concerned 
with mental health, it was felt that the selective na- 
ture of the sample was appropriate for the present 
study. 

Discussion leadership. A colleague, trained in role- 
playing procedures, served as discussion leader for 
all of the groups. In three of the discussion sessions 
he assumed what we have broadly termed a “direc- 
tive” role, while in the remaining three he adopted 
a relatively “nondirective” approach. Briefly defined,
the directive role required that the discussion
leader serve as a professional “expert” for the group, 
interpreting and explaining points that were made 
in the film, responding directly to questions from 
the group, and venturing his own opinions whenever 
an appropriate occasion arose. In the nondirective 
situation, the leader refrained from interpreting the 
film, reflected questions and comments from indi- 
viduals back to the group, and limited expression of 
his own viewpoint as much as possible. The ef- 
fectiveness with which these contrasting roles were 
played was determined from examination of ques- 
tionnaires completed by the Ss at the conclusion of 
the discussions and will be described later.

Experimental conditions. Sessions were held un- 
der informal conditions in the homes of the par- 
ticipants, and, with their knowledge and permission, 
the discussions were tape-recorded. Although the Ss 
were allowed to remain anonymous, identification by 
symbols of each discussion participant with his com- 
ments was accomplished by a procedure described in 
detail elsewhere (McGinnies, 1956). At the begin- 
ning of each meeting, all Ss filled out a short bio- 
graphical inventory. They then viewed the film and 
held a discussion of it. Following the discussion, the 
group members answered a questionnaire designed to 
obtain their opinions of the film, the discussion, the 
discussion leader, and the value of the total experi- 
ence. 

Evaluation techniques. The data from which the 
hypotheses of the experiment were to be evaluated 
were obtained from the questionnaire administered 
after the discussion. The first part of this form con- 
sisted of a list of 20 pairs of polar adjectives, each 
pair defining the limits of a 13-point rating scale. 
The Ss were asked to place a checkmark against 
that position of each scale which they felt best characterized
the discussion leader’s behavior. In 19 of
the 20 adjective-pairs, one of the terms was clearly 
more complimentary or favorable to the leader than 
the other, the direction of favorable response having 
been determined through pretesting with a group of 
college students. One of the adjective-pairs was 
ambiguous but was included because it seemed rele- 
vant to the problem. The adjectives selected were 
based upon a list devised by Molnar (1955) for 
evaluating discussion leadership. 

Following administration of the adjective checklist,
the group members responded to 14 questions
designed to assess their perceptions of the film, the
discussion, the discussion leadership, and the over-all
proceedings.

Measures of leader behavior. In order to validate 
the actual differences in the two roles assumed by 
the discussion leader, the proportion of discussion 
time consumed by the leader under the two condi- 
tions was calculated for each of three equal divi- 
sions of the discussion period. From Table 1, it is 
clear that the leader used more discussion time when 
in a directive position, although the group members 
observed under this condition tended to consume 
more time after the first third of the discussion. 
Further analysis of the leader’s behavior showed 
that his greater dominance of the discussion in the 
directive role resulted from his making longer rather 
than more frequent comments. 

Inspection of the leader’s statements suggested that 
they might be categorized meaningfully to reflect the 
directive or nondirective roles and, hence, to further 
validate the effectiveness with which these two con- 
ditions were established. Table 2 contains the break- 
downs of the leader's comments under the two ex- 
perimental conditions according to these categories. 

It is apparent from this table that in the directive 
role, the discussion leader expressed a much greater 
proportion of opinions or interpretive statements and 
related information somewhat more frequently than 
in the nondirective role. In the nondirective setting, 
the leader engaged in considerably more reflecting or 
rephrasing of group comments, agreed more with re- 
marks made by group members, and directed more 
questions at the group. The coding of leader com- 
ments according to the seven categories was done in- 
dependently by two judges, with a reliability coeffi- 
cient of .90. 

It may be concluded from the data in Tables 1 and
2 that the discussion leader did, in fact, adopt two
discriminable roles in handling the different groups,
and that our initial definitions of the directive and
nondirective approaches were effectively implemented
in the experimental situations.


Table 1

Proportion of Discussion Time Consumed by Leader

                      Time Periods
Groups*           I      II     III    Total

ND (N = 13)      .09    .07    .13     .10
ND (N = 10)      .32    .13    .2?     .23
ND (N = 8)       .16    .14    .24     .18
Mean ND          .19    .11    .20     .17
D (N = 13)       .66    .64    .62     .64
D (N = 6)        .54    .38    .35     .42
D (N = 16)       .69    .43    .47     .53
Mean D           .63    .49    .48     .53

* The symbol ND means nondirective, while D indicates
directive leadership.


Table 2

Percentage of Leader Behavior Occurring in
Various Descriptive Categories

                               ND      D

Opinion or Interpretation      07      53
Reflecting and Rephrasing      21      [illegible]
Relating Information           12      [illegible]
Questions                      15      02
Agreements                     43      15
Disagreements                  00      [illegible]
Humor                          02      02

Discussants’ perceptions of the leader. In order to
score responses to the adjective checklist, ordinal
numbers were assigned to the 13 positions on each
of the 20 scales. In every case, the value 13 indicated
the more “favorable” end of the scale, while
a score of 1 marked the less favorable extreme. A
grand mean for each adjective scale was calculated
and was used to dichotomize that variable for purposes
of comparing score distributions under the two
leadership conditions. In other words, a given S’s
rating was classified as falling either above or below
the mean rating of all Ss for that item. Ss who
were above the grand item means on more than half
of the adjective scales were classified as “more favorable”
in their over-all evaluation of the leader,
while Ss scoring below the grand means on
more than half of the scales were classified as “less
favorable” in their evaluations. There was a uniform
tendency in all of the groups for the leader to
be favorably rated, so that the differences indicated
the relative extent to which the leader was favored
by Ss under the two conditions.
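
The scoring rule just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `classify_raters`, the subject labels, and the ratings are hypothetical, only three scales are used instead of the 20 in the study, and any S who is not above the grand mean on more than half of the scales is treated here as "less favorable" (the text does not say how exact ties were handled).

```python
from statistics import mean

def classify_raters(ratings):
    """Classify each rater as 'more favorable' or 'less favorable'.

    ratings maps a subject label to a list of scale ratings (1..13),
    one rating per adjective scale, 13 being the favorable end.
    """
    n_items = len(next(iter(ratings.values())))
    # Grand mean of all Ss for each adjective scale.
    grand_means = [mean(r[i] for r in ratings.values()) for i in range(n_items)]
    labels = {}
    for subject, r in ratings.items():
        # Count the scales on which this S rated above the grand mean.
        above = sum(1 for i in range(n_items) if r[i] > grand_means[i])
        # Assumption: everything that is not "above on more than half"
        # is counted as less favorable.
        labels[subject] = "more favorable" if above > n_items / 2 else "less favorable"
    return labels

# Hypothetical ratings for four Ss on three scales (not data from the study).
example_ratings = {
    "S1": [13, 12, 11],
    "S2": [12, 13, 10],
    "S3": [6, 7, 12],
    "S4": [5, 4, 3],
}
labels = classify_raters(example_ratings)
```

In this toy example each scale has a grand mean of 9, so S1 and S2 fall above it on all three scales while S3 exceeds it on only one and S4 on none.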
When playing a directive role, the discussion leader
received more favorable ratings by 22 group members
and less favorable ratings by 13 participants.
In the nondirective role, on the other hand, he was
rated more favorably by 8 individuals and less favorably
by 22 persons. These differences when tested
by chi square were significant at the .01 level. Ss
experiencing directive discussion leadership, therefore,
were significantly more favorable in their over-all
evaluation of the discussion leader than were Ss under
nondirective leadership.
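
As a rough check on the comparison above, the frequencies reported in the text (22 and 13 under directive leadership, 8 and 22 under nondirective leadership) can be entered into the usual fourfold chi-square formula. The sketch below is an assumption about how the test may have been computed, not a statement of the authors' procedure: the text does not say whether a continuity correction was used, so both versions are shown. The function name and the constant for the .01 critical value at 1 df (6.635) are supplied for illustration.

```python
def chi_square_2x2(a, b, c, d, yates=False):
    """Chi square for the fourfold table [[a, b], [c, d]] (1 df)."""
    n = a + b + c + d
    diff = abs(a * d - b * c)
    if yates:
        # Yates continuity correction: shrink |ad - bc| by n / 2.
        diff = max(diff - n / 2, 0)
    return n * diff ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))

CRITICAL_01_DF1 = 6.635  # tabled chi square for p = .01 with 1 df

# Rows: directive, nondirective. Columns: more favorable, less favorable.
chi2 = chi_square_2x2(22, 13, 8, 22)
chi2_corrected = chi_square_2x2(22, 13, 8, 22, yates=True)
print(round(chi2, 2), round(chi2_corrected, 2))
```

With or without the correction the statistic exceeds the tabled value, which agrees with the significance level reported in the text.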
Of the 20 adjective pairs, the directive leader received
a more favorable rating in 13, while the nondirective
leader received more complimentary appraisal
in only 7. Contingency tables were constructed
for each of the adjective pairs. The
adjectives favorable to each of the two types of
leadership are listed in Table 3 in order of magnitude
of the chi squares that were obtained. The adjectives
that discriminated between the two
conditions at the .01 and .05 significance levels are
marked by a double or single asterisk respectively.


Table 3

Descriptive Adjectives Ranked According to Their Discriminative Power

        Directive Leadership                    Nondirective Leadership

Rank    Adjective                         Rank    Adjective

  1     **interesting (uninteresting)       1     **permissive (restrictive)
  2     **frank (evasive)                   2     open-minded (opinionated)
  3     **satisfying (disappointing)        3     reserved (forward)
  4     *purposeful (aimless)               4     cautious (rash)
  5     *enlightening (unenlightening)      5     reasonable (stubborn)
  6     *industrious (lazy)                 6     practical (idealistic)
  7     *persuasive (unconvincing)          7     modest (arrogant)
  8     penetrating (superficial)
  9     helpful (hindering)
 10     effective (ineffective)
 11     friendly (unfriendly)
 12     tactful (tactless)
 13     considerate (selfish)


Ratings of the leader and participation. Since
our record of sequence of comments by the discussion
group members enabled us to identify each individual
with his remarks, we were able to dichotomize
the leader ratings by the extent to which the
raters had participated in discussion. When the Ss
under each leadership condition were classified according
to whether they were above or below the
median of those groups with respect to number of
comments (high and low participators, respectively)
and whether their ratings of the leader were above
or below the grand item means on over half of the
adjective scales, the contingencies shown in Table 4
were obtained.

While the results shown in Table 4 are not significant
when the two experimental conditions are considered
separately, analysis of ratings by the low
participators only showed them to be significantly
more satisfied with directive discussion leadership
than with nondirective leadership (p < .01). High
participators tended to rate the leader less favorably
under both leadership conditions.
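
The three comparisons mentioned in this passage can be illustrated from the Table 4 frequencies. The sketch below assumes an uncorrected fourfold chi square with 1 df; the article does not state the exact computation, and the dictionary keys, function name, and critical-value constants (3.841 for .05, 6.635 for .01) are supplied here for illustration only.

```python
def chi_square_fourfold(table):
    """Uncorrected chi square for a 2 x 2 table given as ((a, b), (c, d))."""
    (a, b), (c, d) = table
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))

CRITICAL_05 = 3.841  # 1 df, p = .05
CRITICAL_01 = 6.635  # 1 df, p = .01

# Each row is (more favorable, less favorable), taken from Table 4.
tables = {
    # High vs. low participators within the directive groups.
    "directive": ((8, 9), (13, 4)),
    # High vs. low participators within the nondirective groups.
    "nondirective": ((5, 10), (3, 12)),
    # Low participators only: directive vs. nondirective leadership.
    "low_participators": ((13, 4), (3, 12)),
}
results = {name: chi_square_fourfold(t) for name, t in tables.items()}
for name, value in results.items():
    print(name, round(value, 2))
```

Under these assumptions the two within-condition tables fall short of the .05 value, while the low-participator table exceeds the .01 value, which is the pattern the text describes.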

Evaluation of the discussions. Responses to the 
14 items which followed the adjective checklist were 
uniformly favorable, and for only one item was 
there a significant difference between the directive 
and nondirective groups. When responses to the 
question, “To what extent was the leader responsible 
for important contributions to the discussion?” were 
dichotomized into “very much” and “somewhat or 
little,” the directively led discussions received a sig- 
nificantly (p < .05) greater proportion of ratings in 
the “very much” category than did the nondirectively 
led groups. 

This result provides further evidence that the group 
members were aware of certain salient features of 
both directive and nondirective leadership as these 
roles are ordinarily defined. The remaining 13 items, 
on which no significant differences were obtained, 
dealt with such matters as the adequacy of the film, 
the extent to which important issues were raised in 
the discussion, how relaxed the S was during the discussion,
and the satisfaction of the S with the discussion.
It is perhaps worth noting that 6 of the 31
Ss under directive leadership expressed dissatisfaction
with the discussion, while none of the nondirective
group members indicated disapproval.


Table 4

Frequency of More Favorable and Less Favorable Evaluations of the Leader for High and Low
Participators Under Each Leadership Condition

                            Directive                  Nondirective
                        More        Less           More        Less
                      Favorable   Favorable      Favorable   Favorable

High Participators        8           9              5          10
Low Participators        13           4              3          12
                           p > .05                    p > .05

Since the film dealt with a topic of common in- 
terest, and the experimenters were, in a sense, pro- 
viding an educational program for the participants 
in the experiment, it is not surprising that generally 
favorable reactions to the discussion and the discus- 
sion leader were obtained regardless of the permis- 
siveness of the leader’s role. 


Discussion 


The results of the experiment indicate that 
a directive approach by a discussion leader is 
favored by members of sophisticated adult 
discussion groups. While this preference is 
not revealed in judgments of the participants 
about the discussions, it is reflected in the de- 
gree to which they assign favorable ratings to 
the discussion leader. Examination of the 
group members’ opinions of the leader ac- 
cording to the extent of participation of each 
member in the discussion suggests an explana- 
tion for the over-all group judgment. The 
more favorable opinions of the directive dis- 
cussion leader appear to have come princi- 
pally from those Ss who were classified as 
“low” participators. Inspection of Table 4 
shows that the high participators were about 
equally divided in the extent to which they 
rendered favorable judgments of the leader, 
while the low participators account for the 
significantly more favorable evaluation of 
leadership elicited under directive conditions. 
The distinction between high and low par- 
ticipation in accounting for member satisfac- 
tions and the effects of leadership style has 
also proved to be important in a number of 
other studies (Deignan, 1956; Porter, 1955). 
One might conjecture that the less aggressive 
members of a discussion group are more de- 
pendent upon leadership and, therefore, prefer 
a leader who answers this need. The fact 
that even the more active participants were 
almost 2 to 1 in the direction of less favorable 
response to the leader’s playing a nondirec- 
tive role, however, confirms the finding that 
directive leadership is rated more favorably 
by discussion group members in general.

We found no confirmation for previous reports
that the nondirective, or group-centered,
leader is likely to be more successful in leading
his group, when success is defined in
terms of group reaction to the discussion.
Our Ss were equally well satisfied with the
discussions under both leadership conditions,
but we are unable to say, of course, whether
this would be true with different types of
groups discussing different topics. There is
also the possibility that the personal characteristics
of the discussion leader contributed
to the present over-all pattern of favorable
response. In cases where the leader’s preferences
and skills favor either the directive or
the nondirective role, different results might
be expected. Since our data indicate that the
leader played both roles effectively, this contingency
probably does not apply to the present
findings.


Summary 


Three small groups of adult Ss viewed and
discussed a motion picture film under directive
discussion leadership, while three additional
groups followed the same procedure
under nondirective leadership. Following the
discussion, the Ss rated the leader in terms
of 20 adjective pairs, each of which defined
favorable and unfavorable ends of a continuum.
They also answered questions relative
to the value of the discussion.

In the groups in which he played a directive
role, the leader received significantly
more favorable ratings than in those groups
where he employed a nondirective approach.
The directive leader was rated as significantly
more interesting, frank, satisfying, purposeful,
enlightening, industrious, and persuasive
and significantly less permissive than the
nondirective leader.

When the group members were classified as
high or low participators, it was found that
the low participators were distinctly more favorable
to directive than to nondirective leadership.
The high participators did not react
in significantly different ways to the
two leadership conditions, although they also
tended to be less favorably disposed to the
nondirective leader.

Judgments about the discussion itself were
uniformly favorable in all of the groups and
were not related to the type of leadership imposed.


Received September 24, 1958. 


References


Anderson, H. H., & Brewer, J. Studies of teachers'
classroom personalities: I. Appl. Psychol. Monogr.,
1945, No. 6.

Asch, M. J. Nondirective teaching in psychology:
An experimental study. Psychol. Monogr., 1951,
65, No. 4 (Whole No. 321).

Berkowitz, L. Sharing leadership in small, decision-making
groups. J. abnorm. soc. Psychol., 1953,
48, 231-238.

Bovard, E. W., Jr. Experimental production of interpersonal
affect. J. abnorm. soc. Psychol., 1951,
46, 521-528. (a)

Bovard, E. W., Jr. Group structure and perception.
J. abnorm. soc. Psychol., 1951, 46, 398-405. (b)

Bovard, E. W., Jr. Clinical insight as a function of
group process. J. abnorm. soc. Psychol., 1952, 47,
534-539.

Deignan, F. J. A comparison of effectiveness of two
group discussion methods. Dissertation Abstr.,
1956, 16, 1110-1111.

Evans, J. T. Some measurements of interaction in
therapy groups. Unpublished doctoral dissertation,
Harvard Univer., 1950.

Faw, V. A psychotherapeutic method of teaching
psychology. Amer. Psychologist, 1949, 4, 104.
(Abstract)

Flanders, N. A. Personal-social anxiety as a factor
in experimental learning situations. J. educ. Res.,
1951, 45, 100-110.

Gross, L. An experimental study of the validity of
the nondirective method of teaching. J. Psychol.,
1948, 26, 243-248.

Guthe, C. E. (Chm.) Manual for the study of
food habits; report of the Committee on Food
Habits. Bull. Nat. Res. Council, Wash., D. C.,
1945, No. 111.

Hare, A. P. Small group discussions with participatory
and supervisory leadership. J. abnorm.
soc. Psychol., 1953, 48, 273-275.

Husband, R. A statistical comparison of the efficacy 
of large lecture versus smaller recitation sections 
upon the achievement in general psychology. Amer. 
Psychologist, 1949, 4, 216. (Abstract) 

Leavitt, H. J. Some effects of certain communica- 
tion patterns on group performance. J. abnorm. 
soc. Psychol., 1951, 46, 38-50. 

Levine, J., & Butler, J. Lecture vs. group decision 
in changing behavior. J. appl. Psychol., 1952, 36, 
29-33. 

McGinnies, E. A method for matching anonymous 
questionnaire data with group discussion material. 
J. abnorm. soc. Psychol., 1956, 52, 139-140. 

Maier, N. R. F., & Solem, A. R. The contribution 
of a discussion leader to the quality of group 
thinking. Hum. Relat., 1952, 5, 277-288. 

Molnar, A. The effects of styles, speakers, and argu- 
ments upon the attitudes and perceptions of a 
listening audience. Unpublished master’s thesis, 
Univer. Maryland, 1955. 

Porter, R. M. Relationship of participation to satisfaction
in small group discussions. Dissertation
Abstr., 1955, 15, 2492-2493.

Preston, M. G., & Heintz, R. K. Effects of participa- 
tory vs. supervisory leadership on group judgment. 
J. abnorm. soc. Psychol., 1949, 44, 345-355. 

Radke, M., & Klisurich, D. Experiments in chang- 
ing food habits. J. Amer. dietetics Ass., 1947, 23, 
403-409. 

Robbins, F. G. The impact of social climates upon 
a college class. Sch. Rev., 1952, 60, 275-284. 

Shaw, M. E. A comparison of two types of leadership
in various communication nets. J. abnorm.
soc. Psychol., 1955, 50, 127-134.

Wischmeier, R. R. Group-centered and leader-cen- 
tered leadership: An experimental study. Speech 
Monogr., 1955, 22, 43-48. 

Wispe, L. G. Evaluating section teaching methods
in the introductory course. J. educ. Res., 1951,
45, 161-186.


Journal of Applied Psychology
Vol. 43, No. 4, 1959


A STUDY OF THE VALIDITY OF THE SALES 
COMPREHENSION TEST AND SALES MOTI- 
VATION INVENTORY IN DIFFERENTIAT- 
ING HIGH AND LOW PRODUCTION IN 
LIFE INSURANCE SELLING 


LESTER E. MURRAY 
Murray Placement & Counseling Services, Omaha, Nebraska 
AND MARTIN M. BRUCE


Clark, Channel, Inc., Stamford, Connecticut 


Predicting success in selling continues to be 
a prime concern of many applied psycholo- 
gists in industry. Considerable time, effort, 
and money are spent by business and indus- 
try in an effort to improve procedures and 
techniques in selecting competent sales per- 
sonnel (Super, 1949). This is especially true 
in the life insurance sales field which has been 
supporting a group of psychologists to work 
on this and related problems. 

Husband (1949) provided an excellent re- 
view of the literature on the selection of sales 
personnel up to that date, and more recently 
Austin (1954) reviewed research pertaining to 
sales personnel selection. Basically, published 
research indicates that there is considerable 
room for improvement in tools used for sales 
personnel selection. 

A recent research report (Kennedy, 1958) 
tends to negate an idea developing for some 
time: “General tests” of sales potential have 
little or no value; specific tests for specific 
sales situations are required. The two instru- 
ments used in the present study are “general” 
rather than specific. 

In 1953 Bruce published two instruments 
designed to aid in selection of sales personnel,
the Sales Comprehension Test and the Sales 
Motivation Inventory. A validation article on 
the first instrument appeared in this journal 
the following year (Bruce, 1954a). 

The purpose of the present investigation is 
to determine if a relationship exists between 
these two instruments and achievement in life 
insurance selling. 

Procedure

Tests

The Sales Motivation Inventory (Bruce, 1954b) is
a 75-item multiple choice preference form. It correlates
relatively highly with the sales scales of the
Strong Vocational Interest Blank for Men and with
the persuasive score of the Kuder Preference Record,
Vocational. Supplementary normative data have appeared
in the literature (Bruce: 1956c; 1956f; Murray
& Bruce, 1957b), but not validation studies.
The Sales Comprehension Test (Bruce: 
1957a) is a derivative of the test of A 
(Principles of Selling) Form A (Bruce, 195 ‘cons 
1953). It contains 30 multiple choice items 
ing of concepts and sales situations. plica 
research studies and supplementary data ar 
have appeared in the literature relative tO è 
Comprehension Test (Bass, 1957; Bruce: 
1956b; 1956c; 1956d; 1956g; 1957b; Bruce 
sen, 1956; Hecht & Bruce: 1957; 1958; t 
Bruce, 1957a) and the precursor instrumen! s 
1954b; Gray & Rosen, 1956; Harless & Bruc®è 
Speer, 1957), 


Subjects 


The Ss of the investigation were 60 ordinary life
insurance salesmen chosen at random from among
volunteers in companies licensed to operate in the
state of Nebraska. This group was dichotomized at
the $400,000 production mark into “successful” and
“unsuccessful” groups, placing 39 men in the former
group and 21 in the latter. These men represented
[illegible] life insurance companies. The territories of
[illegible] of these men were in Omaha and Lincoln,
while the remainder covered small towns. No industrial
life insurance salesmen were included in the study.
All Ss worked full time in life insurance selling.
They had very limited or no supervisory responsibility.
Each man had at least one year of experience.

The range of insurance sold in the criterion year
was from $168,000 to $2,479,000. The mean age of
the “successful” group was 33.8, while [illegible] was
the mean age of the “unsuccessful” group. Length
of life insurance sales experience averaged [illegible]
for the former group, and 9.6 years for the latter.
Mean paid-for life insurance production for the
calendar year 1956 was $696,000 for the “successful”
group, and $306,000 for the “unsuccessful.”




The Criterion 


The measure of performance used in this study
was the paid-for life insurance production for the
calendar year 1956.


Results 


Group comparisons of Sales Motivation Inventory
scores, Sales Comprehension Test
scores, and a combination of these two scores
were made to determine if there were significant
differences between the “successful” and
“unsuccessful” life insurance salesmen.
The t technique was employed to determine
significance of difference between group mean
scores, utilizing where applicable a correction
formula that takes into account the heterogeneity
of variance (Guilford, 1956). The
existence of common population variance was
determined by applying the F test.


Sales Motivation Inventory


Scores for “successful” salesmen ranged from
[illegible] to 12 on the Sales Motivation Inventory,
while for the “unsuccessful” group, the range
on this instrument was 57 to -13. The
means, respectively, were 35.54 and 25.20,
and the SDs were 11.70 and 18.10.

The resulting F of 2.39 is .01 short of the
.01 level of confidence, and the t of 2.63 is
significant at the .05 level (2.07 required).
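
The two statistics reported here can be approximately reproduced from the summary figures. The sketch below is an illustration, not the authors' worksheet: it assumes the F is the ratio of the larger to the smaller squared SD, and that the t uses a pooled error term in which each group's sum of squares is taken as N times its squared SD. The source cites Guilford (1956) without printing a formula, so this form is an assumption, and the heterogeneity correction that yields the "2.07 required" value is not attempted. Function and variable names are hypothetical.

```python
from math import sqrt

def variance_ratio(sd_a, sd_b):
    """F as the larger variance divided by the smaller variance."""
    big, small = max(sd_a, sd_b), min(sd_a, sd_b)
    return (big / small) ** 2

def pooled_t(n1, m1, sd1, n2, m2, sd2):
    """t for two independent means with a pooled variance estimate.

    Assumption: each sum of squares is N * SD**2 (SDs computed with N
    in the denominator), pooled over N1 + N2 - 2 degrees of freedom.
    """
    sum_squares = n1 * sd1 ** 2 + n2 * sd2 ** 2
    pooled_var = sum_squares / (n1 + n2 - 2)
    std_error = sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (m1 - m2) / std_error

# Sales Motivation Inventory: 39 "successful" and 21 "unsuccessful" salesmen.
f_ratio = variance_ratio(11.70, 18.10)
t_value = pooled_t(39, 35.54, 11.70, 21, 25.20, 18.10)
print(round(f_ratio, 2), round(t_value, 2))
```

That this form gives values matching the reported F and t to two decimals suggests, but does not establish, that a pooled computation of this kind lies behind the published figures.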
he findings suggest that the Sales Moti- 

‘on Inventory is capable of differentiating 
se competent life insurance salesmen 
nen „the less competent life insurance sales- 
der D the geographical area covered, and un- 

e circumstances of this study. 
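
The F and t comparisons described above can be sketched from the reported summary figures alone. This is an illustration, not the authors' computation: the text does not say which t formula was applied, the group size of 39 is inferred from the total of 60 Ss with 21 in the second group, and the SDs are assumed to have been computed with N in the denominator. The function names are hypothetical.

```python
import math


def variance_ratio(sd_a, sd_b):
    """F ratio: the larger variance divided by the smaller variance."""
    var_a, var_b = sd_a ** 2, sd_b ** 2
    return max(var_a, var_b) / min(var_a, var_b)


def pooled_t(mean_a, sd_a, n_a, mean_b, sd_b, n_b):
    """Pooled-variance t from summary statistics.

    Assumes each SD was computed with N (not N - 1) in the denominator,
    so each sum of squares is recovered as N * SD**2.
    """
    sum_sq = n_a * sd_a ** 2 + n_b * sd_b ** 2
    pooled_var = sum_sq / (n_a + n_b - 2)
    std_err = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean_a - mean_b) / std_err


def separate_variance_t(mean_a, sd_a, n_a, mean_b, sd_b, n_b):
    """t using separate variance estimates, for heterogeneous variances."""
    std_err = math.sqrt(sd_a ** 2 / (n_a - 1) + sd_b ** 2 / (n_b - 1))
    return (mean_a - mean_b) / std_err


# Sales Motivation Inventory figures from the text.
# N_SUCCESSFUL = 39 is an assumption (60 Ss in all, 21 in the second group).
N_SUCCESSFUL, N_UNSUCCESSFUL = 39, 21
f_ratio = variance_ratio(11.70, 18.10)
t_pooled = pooled_t(35.54, 11.70, N_SUCCESSFUL, 25.20, 18.10, N_UNSUCCESSFUL)
t_separate = separate_variance_t(
    35.54, 11.70, N_SUCCESSFUL, 25.20, 18.10, N_UNSUCCESSFUL
)
print(f"F = {f_ratio:.2f}")
print(f"pooled t = {t_pooled:.2f}, separate-variance t = {t_separate:.2f}")
```

Under these assumptions the variance ratio and the pooled t come out close to the reported F and t, while the separate-variance form gives a smaller value. Which form the authors actually used remains an open question.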


Sales Comprehension Test


On the Sales Comprehension Test, scores for
the “successful” group ranged from 57 to
[illegible], while 52 and 7 were the upper and lower
limits of scores for the “unsuccessful” salesmen.
The mean and SD in the former group were 32.23 and
11.15, and in the latter group [illegible] was the
mean and 13.5 was the SD. The F here proved to be
1.47, short of significance at the .05 level. The
required t of [illegible] for significance at the
.05 level was not obtained.

These findings suggest that, used alone, the
Sales Comprehension Test is not a valid
differentiator of competence in life insurance
selling.


Sales Motivation Inventory and Sales Com- 
prehension Test Combined 


Part of the initial plan of study was to 
combine the two predictors for purposes of 
further analysis. The significant difference 
shown by the Sales Motivation Inventory and 
the direction, though lack of significance, of 
the Sales Comprehension Test scores rein- 
forced the soundness of this approach with a 
suggestion of potential differentiation at a 
statistically significant level. 

Composite scores for each member of the 
total population were obtained by weighting 
the two tests equally. This was accomplished 
by weighting raw scores in inverse proportion 
to SDs of the respective tests. These composite
scores ranged from 130 to 38 in the “successful”
group, and 111 to 15 in the “unsuccessful” group.
Means for the “successful” and the “unsuccessful”
groups were, respectively, 83.70 and 67.60. The
respective SDs were 22.16 and 20.17.

In this situation, a t of 2.66 is required for
significance at the .01 level. The obtained t
of 2.73 exceeds this, indicating that there is
less than one chance in 100 that a difference
between means this large would arise by chance alone.
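
The equal weighting described above (raw scores weighted in inverse proportion to each test's SD) can be sketched as follows. The raw scores here are hypothetical, and the function names are illustrative. The text does not give the scaling constant that produced the reported 15-to-130 composite range, so none is applied.

```python
from statistics import pstdev


def inverse_sd_weights(scores_a, scores_b):
    """Return one weight per test, each the reciprocal of that test's SD."""
    sd_a, sd_b = pstdev(scores_a), pstdev(scores_b)
    if sd_a == 0 or sd_b == 0:
        raise ValueError("each test needs some spread in its scores")
    return 1 / sd_a, 1 / sd_b


def composite_scores(scores_a, scores_b):
    """Sum the two weighted raw scores for each person."""
    w_a, w_b = inverse_sd_weights(scores_a, scores_b)
    return [w_a * a + w_b * b for a, b in zip(scores_a, scores_b)]


# Hypothetical raw scores for five salesmen (not data from the study).
motivation = [40, 12, 57, 25, 33]
comprehension = [30, 45, 22, 38, 28]

w_motivation, w_comprehension = inverse_sd_weights(motivation, comprehension)
weighted_motivation = [w_motivation * x for x in motivation]
weighted_comprehension = [w_comprehension * x for x in comprehension]
composites = composite_scores(motivation, comprehension)

# After weighting, both tests have the same spread, so neither dominates.
print(pstdev(weighted_motivation), pstdev(weighted_comprehension))
```

Dividing by the SD puts both instruments on a common scale of variability, which is what "weighting the two tests equally" amounts to when the tests have different raw-score spreads.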


Summary and Conclusions 


Sixty ordinary life insurance agent volun- 
teers were obtained from 17 companies op- 
erating in Nebraska. All were experienced 
life insurance salesmen. The population was 
dichotomized unequally into “successful” and 
“unsuccessful” groups on the basis of insur- 
ance sold during the previous calendar year. 
All men completed the Sales Motivation In- 
ventory and Sales Comprehension Test. 

For both groups, t and F tests were computed,
and a t test was applied to a combined
test score.

Status validity in this situation was shown 
for the Sales Motivation Inventory to the extent
of a t significant at the .02 level and an
F significant at approximately the .05 level.
The Sales Comprehension Test failed to show 
validity at accepted significance levels. 

Scores of the two instruments, when combined,
yielded a t significant at the .01 level.




The applicability of these findings to life 
insurance selling in general is questionable in 
the absence of further research because of 
the restricted geographical distribution of the 
population employed. 

Adequacy of criteria is a traditional re- 
search problem, and one that merits ques- 
tioning in this study. Reliability might be 
obtained by correlating the criterion year’s
production with previous or subsequent years.
A positive relationship would enhance
confidence in the results.

Equivalence of territories appears to be a 
reasonable assumption, but is not known as a 
fact. Subsequent studies might well control 
this factor more adequately. There are also 
the names of the companies in which the men 
sell. These may have advantages or disad- 
vantages worthy of control in order to better 
equate populations studied. 

A single criterion was employed in this 
study. Greater validity and reliability are 
potentially available through multiple criteria 
when evaluating predictors for complex jobs 
such as that of salesmen. A larger N is
certainly to be sought in subsequent studies.

The results of this study, in spite of
limitations of design and content, suggest possible
value for the Sales Motivation Inventory and
[illegible]. At least one such study with the
Sales Comprehension Test involving 9000 job
applicants is now in progress.

Received September 25, 1958.


References 


Austin, R. L. Selection of sales personnel: A review
of research. Unpublished doctoral dissertation,
Indiana Univer., 1954.

Bass, B. M. Validity information exchange, No.
10-25. Personnel Psychol., 1957, 10, 343-344.


Bruce, M. M. Examiners manual, Sales Compre- 
hension Test, Form M. New Rochelle: Author, 
1953. (a) 


Bruce, M. M. Examiners manual, Sales Motivation 
Inventory. New Rochelle: Author, 1953. (b) 


Lester E. Murray and Martin M. Bruce 


Bruce, M. M. A sales comprehension test. J. appl.
Psychol., 1954, 38, 302-304. (a)

Bruce, M. M. Validity information exchange, No.
7-022. Personnel Psychol., 1954, 7, 157. (b)

Bruce, M. M. Normative data information exchange,
No. 19. Personnel Psychol., 1956, 9, [illegible]. (a)

Bruce, M. M. Normative data information exchange,
No. 20. Personnel Psychol., 1956, 9, 397. (b)

Bruce, M. M. Normative data information exchange,
No. 21. Personnel Psychol., 1956, 9, 398. (c)

Bruce, M. M. Normative data information exchange,
No. 22. Personnel Psychol., 1956, 9, 399. (d)

Bruce, M. M. Normative data information exchange,
No. 23. Personnel Psychol., 1956, 9, [illegible]. (e)

Bruce, M. M. Normative data information exchange,
No. 24. Personnel Psychol., 1956, 9, 402. (f)

Bruce, M. M. Validity information exchange, No.
9-45. Personnel Psychol., 1956, 9, 523-52[illegible]. (g)

Bruce, M. M. Examiners manual, Sales Comprehension
Test, Form M, Supplement. New Rochelle:
Author, 1957. (a)

Bruce, M. M. Validity and normative data information
exchange, No. 10-23. Personnel Psychol.,
1957, 10, 245. (b)

Bruce, M. M. Examiners manual, Aptitude Associates
Test of Sales Aptitude (Basic [illegible]
Selling) Form A. (7th ed.) New Rochelle: Author,
1958.

Bruce, M. M., & Friesen, E. P. Validity information
exchange, No. 9-35. Personnel Psychol., 1956, 9,
380.

Buros, O. K. (Ed.) The fourth mental measurements
yearbook. Highland Park, N. J.: Gryphon
Press, 1953.

Gray, E. J., & Rosen, J. C. Validity information
exchange, No. 9-7. Personnel Psychol., 1956, 9,
112.

Guilford, J. P. Fundamental statistics in psychology
and education. New York: McGraw-Hill, 1956.

Harless, B. B., & Bruce, M. M. Normative data
information exchange, No. 10-8. Personnel Psychol.,
1957, 10, 104.

Hecht, R., & Bruce, M. M. Normative data information
exchange, No. 10-8. Personnel Psychol.,
1957, 10, 536.

Hecht, R., & Bruce, M. M. Normative data information
exchange, No. 11-4. Personnel Psychol.,
1958, 11, 133.

Husband, R. W. Techniques of salesman selection.
Educ. psychol. Measmt, 1949, 9, 129-1[illegible].

Kennedy, J. E. A general device versus more
specific devices for selecting car salesmen. J. appl.
Psychol., 1958, 42, 206-209.

Murray, L. E., & Bruce, M. M. Normative data
information exchange, No. 10-9. Personnel Psychol.,
1957, 10, 105-106. (a)

Murray, L. E., & Bruce, M. M. Normative data
information exchange, No. 10-9. Personnel Psychol.,
1957, 10, 107-109. (b)

Speer, G. S. Validity information exchange, No.
10-16. Personnel Psychol., 1957, 10, 206.

Super, D. E. Appraising vocational fitness by means
of psychological tests. New York: Harper, 1949.

Journal of Applied Psychology
Vol. 43, No. 4, 1959


THE EFFECT OF A SUBLIMINAL FOOD STIMULUS
ON VERBAL RESPONSES ¹


DONN BYRNE ²


San Francisco State College


Since the autumn of 1957, subliminal stimulation
has been the object of considerable public
attention. The furor was instigated by the widely
publicized Vicary (Brooks, 1958) study in which
sales were increased by the presentation of the
phrases “Eat Popcorn” and “Drink Coca-Cola” at
1/3000 of a second on a movie screen before an
unsuspecting audience. Subsequent public fears
and pronouncements have far out-distanced
the empirical data.


Problem


The concept of subliminal perception is a
familiar one to psychology. It has long been known
that behavior may be affected by stimuli of which
the individual is not consciously aware. Beginning
in the nineteenth century, laboratory evidence has
accumulated to reveal greater than chance accuracy
in the discrimination of visual, auditory, and
olfactory stimuli rendered subliminal by distance
(Sidis, 1898; Stroh, Shaw, & Washburn, 1908), low
intensity of the stimulus (Coyne, King, Zubin, &
Landis, 1943; Hollingworth, 1913; Laird, 1932;
Miller, 1939; Williams, 1938), low intensity of
surrounding illumination (Baker, 1937), high
intensity of surrounding illumination (King,
Landis, & Zubin, 1944), and lack of attention
(Collier, 1940). In addition, research dealing
with perceptual defense and learning without
awareness involves response to stimuli below the
Ss’ conscious thresholds. Literature in these
research areas has been reviewed by Adams (1957),
Lazarus and McCleary (1951), and McConnell,
Cutler, and McNeil (1958).


¹ Funds for this study were provided by the
Institute for Communication Research of Stanford
University. The author wishes to express his
thanks to W. Schramm for his interest and support
and to his colleagues R. Haber, L. Petrinovich,
H. Robinson, M. [illegible], and D. Shannon for
their cooperation. The work of the research
assistants F. Burkart, W. [illegible], F. Kopache,
D. Minor, J. Murphy, [illegible], J. Palmer,
S. Perry, B. Powell, N. [illegible], [illegible]
Starrett, T. Stephens, and E. Sweeney is also
gratefully acknowledged.

None of these laboratory experiments has 
given rise to public concern. However, many 
persons have recently become frightened by 
the idea that subliminal stimuli may be ac- 
tively manipulated in an effort to influence 
human behavior in some predetermined man- 
ner. Fears have centered in the possible un- 
scrupulous and unethical use of the tech- 
nique by advertisers and politicians. Visions 
of 1984 and Brave New World have been met 
by reassurances that no one can be made to 
do something personally unacceptable, that 
the stimulus can only act as a reminder for 
an existing desire, etc. What seems most 
clear is that extensive research is needed. 

The present investigation represents an at- 
tempt to isolate and study a few of the pos- 
sible variables. With a food word as the 
subliminal stimulus, four hypotheses were 
formulated: (a) verbal references to the 
stimulus word are increased; (b) in a choice 
situation, the stimulus object is preferred;
(c) subjectively perceived hunger is greater; 
and (d) each of these effects is greater under 
conditions of high physiological hunger drive. 


Method 


The Ss were 105 (45 male, 60 female) students in 
four required freshman psychology classes. The ex- 
perimental and control groups each consisted of one 
11:00 A.M. and one 1:00 P.M. class.

A 16-minute movie, Controlling Behavior Through 
Reinforcement, was shown to each group. For the 
experimental Ss only, the word “beef” was super- 
imposed on the screen every seven seconds in flashes 
of 1/200 of a second duration.³


3 An Eastman-Signet Slide Projector (Model I, 
500 watts, 35 mm.) was used, with a five-inch f3.5 
Kodak Ektanon lens. A new automatic shutter sys- 
tem was designed and constructed by James F. Lee, 
94 Willow Rd., Menlo Park, California. The shutter 
system, itself, consisted of a Compur Rapid 2.2 
shutter with a maximum opening of 63/64 inches. 
All of the apparatus was placed in a soundproof
projection box. 




Immediately following the movie, two student Es 
entered the room and asked permission to administer 
a “Health Inventory.” The Ss rated their hunger on 
a five-point scale (not at all hungry to very hun- 
gry), responded to a brief sentence completion and 
word-association test, checked their sandwich pref- 
erence (tuna, hamburger, cheese, steak, or roast 
beef), and indicated the time at which their last 
meal was eaten. Identical items concerning fatigue 
and smoking were included as buffers. Afterward, 
the Ss were asked their reactions to the movie and 
whether they noticed anything unusual about it. 


Results 


Of the original 108 Ss who took part in the 
experiment, two saw the word “beer” and one 
saw “beef” flashed on the screen, so they 
were eliminated from the sample. The re- 
maining Ss gave no evidence of having per- 
ceived the stimulus. 

First hypothesis. The responses to the sen- 
tence completion and word association tests 
did not contain a sufficient number of beef or 
meat references to yield scorable categories. 
Subliminal stimulation did not increase ver- 
bal references to the stimulus word. 

Second hypothesis. A higher proportion of 
the experimental than of the control group 
(.37 and .28) chose roast beef in preference 
to the other four sandwich types. However, 
this difference was not statistically significant 
(χ² = 1.10, P > .05). The difference almost
reached statistical significance with the female
Ss (χ² = 3.70, P < .10) but not at all
for the male Ss (χ² = .02, P > .05). The sex
difference in choosing the roast beef sandwich
in the control versus the experimental
groups was a significant one (χ² = 8.46, P <
.01). The subliminal presentation of the
word “beef” did not influence food prefer- 
ences as measured by a paper and pencil 
device. 
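
The chi-square comparisons reported for the second hypothesis can be illustrated with a fourfold table. The article gives proportions and group sizes but not cell counts, so the counts below are hypothetical. They were chosen only to sit near the reported .37 and .28 and need not reproduce the reported χ² values. The function name and the absence of a continuity correction are likewise assumptions.

```python
def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (1 df, no continuity correction) for the table

                    chose item   chose other
    experimental        a             b
    control             c             d
    """
    n = a + b + c + d
    row_1, row_2 = a + b, c + d
    col_1, col_2 = a + c, b + d
    if 0 in (row_1, row_2, col_1, col_2):
        raise ValueError("every row and column total must be positive")
    return n * (a * d - b * c) ** 2 / (row_1 * row_2 * col_1 * col_2)


CRITICAL_05 = 3.841  # chi-square value required at the .05 level with 1 df

# Hypothetical counts: 21 of 58 experimental Ss and 13 of 47 control Ss
# choosing the roast beef sandwich (not the article's actual counts).
chi2 = chi_square_2x2(21, 37, 13, 34)
print(f"chi-square = {chi2:.2f}, significant at .05: {chi2 > CRITICAL_05}")
```

With counts of this size, a difference of roughly nine percentage points falls well short of the .05 criterion, which is consistent in direction with the nonsignificant result reported in the text.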

Third hypothesis. The experimental Ss 
rated themselves hungrier than did the con- 
trol Ss to a statistically significant degree 
(F = 11.00, df = 1/101, P < .01) as tested 
by analysis of variance. An attempt was 
made to control the physiological hunger 
state by using each group at the same times 
of day. However, there were group differ- 
ences in hours of food deprivation, and this 
latter variable was also significantly related 
to the hunger ratings (r = .21, P < .02).




Table 1


Comparison of Experimental and Control Group
Differences on Hunger Self-Ratings and
on Hours of Food Deprivation


                   Hunger Scale
                     Ratings              Hours
Group          N     M      SD       M            SD
Experimental   58    2.3    1.17     [illegible]  3.42
Control        47    1.6     .96     [illegible]  3.78


Therefore, the hunger ratings of the experimental
and control groups were compared by
another analysis of variance, with hours of
food deprivation controlled by the covariance
technique. Group differences were still
significant (F = 10.96, df = 1/102, P < .01).
No significant sex differences were found on
this variable. The means and standard
deviations are shown in Table 1. Since the
groups did not differ significantly in their
self-ratings of fatigue or desire for a cigarette,
it was concluded that the subliminal food
stimulus increased subjective hunger, as
measured by a self-rating hunger scale.

Fourth hypothesis. Food deprivation of
200 minutes was a convenient midpoint to
divide Ss into high and low drive state groups.
The experimental and control groups were
compared on their sandwich preferences in
the high and low drive subgroups
separately. In neither high drive (χ² = [illegible],
P > .05) nor low drive (χ² = .16, P > .05)
conditions was there a significant difference
in sandwich preferences.


Table 2


Comparison of Experimental and Control Group
Differences on Hunger Self-Ratings in High
and Low Drive Conditions


Group                        N
Experimental-High Drive      33     4.80
Experimental-Low Drive       25     2.18
Control-High Drive           17     4.27
Control-Low Drive            30     [illegible]




The differential effect of drive state on the
hunger ratings for the experimental and control
groups (data shown in Table 2) was
examined by a two-way analysis of variance
using the Walker and Lev approximation
(1953, pp. 381-382). In addition to the
group differences reported above, the high
drive Ss rated themselves hungrier than the
low drive Ss (F = 22.50, df = 1/101, P <
.001), but the interaction effects were not
significant (F = 1.00, df = 1/101, P > .05).
Thus, a high drive state was not found to be
a necessary condition for influence by a
subliminal stimulus.


Discussion 


The results of this investigation suggest
that it is possible to affect some aspects
of human behavior through the use of
stimuli presented in exposures too rapid for
conscious perception. The findings also suggest,
not too surprisingly, that any commercial
use of subliminal stimulation
must take more variables into account than
simply the chosen stimulus and the desired
response.

Even though the projective measures of
[illegible] did not reflect any stimulus influence,
it is possible that more extensive and
varied projective devices would have yielded
the expected results.

The failure of the experimental Ss to
choose the roast beef sandwich significantly
more than the control Ss suggests the following
explanation. In asking for sandwich
preferences, one is dealing with long established
habit patterns which have been reinforced
on countless occasions. It might be
hypothesized, though, that subliminal stimulation
could determine preferences between
unfamiliar and desirable alternatives
(e.g., a previously unknown political candidate)
or between items with minimal
differentiating qualities (e.g., two popular
brands of toothpaste). The finding of significant
sex differences in susceptibility to
subliminal influence should be explored further.


Since the subliminal food stimulus did increase
subjectively felt hunger, it seems reasonable
to hypothesize that appropriate stimulation
could arouse thirst, fear, hate, anxiety,
sexual desire, etc. Should drive arousal prove 
to be the major effect of subliminal stimula- 
tion, it is possible that Vicary’s findings 
(Brooks, 1958) resulted from the evoking of 
hunger and thirst drives; popcorn and Coca- 
Cola were bought simply because they were 
available in the lobby. It would be interest- 
ing to learn whether the sale of other soft 
drinks and of candy bars also increased. 

Finally, the idea that a drive must be pres- 
ent for subliminal stimulation to be effective 
was not supported. Rather, it may be pos- 
sible to use this method to create a need 
where it is absent. 


Summary 


This experiment was undertaken in order 
to test four hypotheses involving the effect of 
subliminal stimulation on human behavior. 
The experimental group saw a classroom 
movie with the word “beef” superimposed in 
flashes of 1/200 of a second every seven 
seconds; the control group just saw the 
movie. It was found that, compared to the 
control Ss, the experimental Ss (a) did not 
show increased verbal references to the stimu- 
lus word; (b) did not choose the stimulus ob- 
ject in a multiple choice situation (though 
sex differences were significant); but (c) did 
rate themselves significantly more hungry. It 
was also found that hours of food deprivation 
did not influence any of these relationships. 


Received September 29, 1958. 


References 


Adams, J. K. Laboratory studies of behavior with- 
out awareness. Psychol. Bull., 1957, 54, 383-405. 

Baker, L. E. The influence of subliminal stimuli 
upon verbal behavior. J. exp. Psychol., 1937, 20, 
84-100. 

Brooks, J. The little ad that isn’t there. Consumer
Rep., 1958, 23 (1), 7-10.

Collier, R. M. An experimental study of the effects
of subliminal stimuli. Psychol. Monogr., 1940, 52,
No. 5 (Whole No. 236).

Coyne, J. W., King, H. E., Zubin, J., & Landis, C.
Accuracy of recognition of subliminal auditory
stimuli. J. exp. Psychol., 1943, 33, 508-513.

Hollingworth, H. L. Advertising and selling. New
York: Appleton, 1913.

King, H. E., Landis, C., & Zubin, J. Visual subliminal
perception where a figure is obscured by
the illumination of the ground. J. exp. Psychol.,
1944, 34, 60-69.

Laird, D. A. How the consumer estimates quality
by subconscious sensory impressions. J. appl.
Psychol., 1932, 16, 241-246.

Lazarus, R. S., & McCleary, R. A. Autonomic
discrimination without awareness: A study of
subception. Psychol. Rev., 1951, 58, 113-122.

McConnell, J. V., Cutler, R. L., & McNeil, E. B.
Subliminal stimulation: An overview. Amer.
Psychologist, 1958, 13, 229-242.

Miller, J. G. Discrimination without awareness.
Amer. J. Psychol., 1939, 52, 562-578.

Sidis, B. The psychology of suggestion. New York:
Appleton, 1898.

Stroh, M., Shaw, A. M., & Washburn, M. F. A
study in guessing. Amer. J. Psychol., 1908, 19,
243-245.

Walker, Helen M., & Lev, J. Statistical inference.
New York: Henry Holt, 1953.

Williams, A. C., Jr. Perception of subliminal visual
stimuli. J. Psychol., 1938, 6, 187-199.


Journal of Applied Psychology


PSYCHOLOGICAL ADJUSTMENT AND THE WORKER ROLE:
AN ANALYSIS OF OCCUPATIONAL DIFFERENCES ¹


LAWRENCE G. COREY 


Industrial Relations Center, University of Chicago 


In recent years, the role of worker has taken
on added significance, especially in the area
of aging and retirement. Because, in the
past, it occupied by necessity so much of his
time, the industrial employee’s worker role
gradually became second in importance only
to his religion. As a consequence, during
the early phases of our country’s expanding
economy, a developing labor force felt some
deep identification not only with their occupation
but with their nation’s pioneering spirit.
The vigorous industrial pioneer of yesterday
is the older worker of today, and the
psychological support he has come to expect
from his work-role is somehow
[illegible] by our recent trend toward automation.
This automation has subsequently
resulted in a depersonalization of the older
worker’s previously paternalistic industrial
setting, a depersonalization which has fostered
structural changes in the industrial environment
to which the older worker has not
become reconciled.

Thus the problem of whether psychological
adjustment is equally dependent upon the
worker role for two occupational groups, such
as nonmanual and manual employees, would
seem to be of some importance. In fact,
Burgess and his associates claim that the
nonmanual employee is better prepared for
aging and retirement, in general, than his
occupational counterpart, the manual laborer
(Burgess, Corey, Pineo, & Thornbury, 1958).
Therefore, this paper will attempt to clarify
the following question: Is the personal and
social adjustment of managers, supervisors,
professional-technical and clerical-sales personnel
(nonmanual employees) equally dependent
upon their role of worker, as the personal
and social adjustment of skilled, semiskilled
and unskilled personnel (manual laborers)?


¹ The author is particularly indebted to Ernest W.
Burgess for his kind counsel and helpful criticism of
the manuscript, and also to his colleague, Peter C.
Pineo, who helped to collect and code the data.

However, before approaching an investiga- 
tion of the preceding question, it is neces- 
sary to arrive at a suitable definition of the 
worker role. In a general statement of the- 
ory, Sarbin (1956) defines the concept of 
“Role” as “... a patterned sequence of 
learned actions or deeds performed by a per- 
son in an interaction situation.” 

Furthermore, Havighurst (1957), discussing 
the implications of competent role perform- 
ance to personal and social adjustment, writes: 


Consequently, since a social role is both a social
expectation and a self-expectation, performance in
the common social roles should be related both to
. . . social adjustment in the sense of a smooth
adjustment to one’s society, and to personal
adjustment in the sense of satisfactory feelings
about oneself.


For the purpose of this analysis, then, it 
will be sufficient to define the worker role in 
the following terms: The work-role is a pat- 
terned sequence of actions or deeds, learned 
for the purpose of utilitarian productivity, 
performed by a person in an industrial situa- 
tion and leading, through their competent per- 
formance, to a satisfactory manipulation of 
that person’s occupational skills, to a social 
adjustment with his fellow workers and to a 
personal self-esteem. 

Competency in the work-role therefore im- 
plies three conditions: first, that the person 
skillfully manipulates the tools of his occu- 
pation; secondly, that he “enjoys” using these 
tools; and thirdly, that he employs these 
tools with a certain “flair,” savoir-faire, or 
inventiveness. Although the measure of 
worker role used in this paper does not spe- 
cifically refer to these three conditions, it was 
designed with them in mind and consequently 
implies a qualitative relationship to them.




Methods 


The 301 Ss, all between the ages of 55 and 65 and all
employees of a Midwestern oil refinery, were divided
into the classifications nonmanual and manual work- 
ers. The former classification included 116 of the 
company’s managerial, supervisory, professional- 
technical, and clerical-sales personnel, while the lat- 
ter classification contained 185 of the company’s 
skilled, semiskilled and unskilled laborers. 

The Ss participated in a survey made by the In- 
dustrial Relations Center of the University of Chi- 
cago, at which time they anonymously answered 
either Yes, No, or Undecided to the 100 statements 
in the Retirement Planning Inventory prepared by 
Burgess and Mack and published by the Industrial 
Relations Center. 

Eighty-three of the 100 statements in the Retirement
Planning Inventory were written by the authors;
the remaining 17 items were taken from other
standardized sources. These 17 items constitute two
of the four measures reported in this paper,
“Personal Adjustment” and “Job Satisfaction.” The 10
items in the Inventory which presumably measure
personal adjustment were derived from an item 
analysis of the Cavan and Burgess “Study of the 
Personal Adjustment of Old People.” Job satisfac- 
tion is measured in the Retirement Planning Inven- 
tory by the seven items in the SRA Employee In- 
ventory found to constitute the factorial trait of job 
satisfaction. The other two diagnostic measures in
this paper, “Social Adjustment” and the “Worker
Role,” were developed from the 80 items written
by Burgess and Mack. Social adjustment was gener- 
ated from an item analysis (phi) of data collected 
on over 1000 older people. 

The measure of “Worker Role” requires some ex- 
planation. Five judges (hopefully a cross section in 
miniature of society) were asked to go over the 100 
Statements in the Inventory and select those which 
seemed to touch on various aspects of a person’s 
role as worker. The judges were told to rely mainly 
on their own discretion in selecting items. In order 
to standardize the procedure, however, the judges 
were given a set of instructions to use as a basis for 
their selection. In order for a question to qualify 
as a measure of the worker role it had to meet the 
following basic requirements: “Question applies to 
the respondent's present job only as it relates to his 
present economic, emotional and old age role. It 
does not involve any relationships with his family, 
friends or co-workers. It does not apply to his fu- 
ture plans for an occupation.” Questions which in- 
ferred any family, friends, or co-worker relationships 
were purposely excluded from consideration in order 
to keep the measure of worker role as free from 
other variables as possible. A binomial expansion 
was then calculated to determine a point of agree- 
ment between the five judges that would meet or 
exceed the .01 level of confidence. Those items
which then met or exceeded the .01 level of
agreement between the five judges were accepted as
indices of the worker role. Five questions were
subsequently selected by this method and constitute the
measure of worker role used in this paper. These
five questions and their favorable responses
(predetermined by Burgess and Mack) are available upon
request from the author.


Table 1

Correlation Analysis Between Three Psychological
Factors and the Worker Role, by
Occupation Status
(Adapted from Corey a)

                              Worker Role
                         Nonmanual      Manual
Personal Adjustment         .08           .34
Social Adjustment       [illegible]   [illegible]
Job Satisfaction        [illegible]   [illegible]

* Correlation significant at and beyond the .05 level of
confidence.
a Previously unpublished data of a larger project to
determine psychological factors related to various
old-age roles.
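
The judges' agreement criterion described under Methods (a binomial expansion used to find the level of agreement among five judges that meets the .01 level) can be sketched as below. The article does not report the chance probability that a single judge selects a given item, so the values of p used here are hypothetical, as are the function names.

```python
from math import comb


def tail_probability(n_judges, k, p):
    """Probability that k or more of n_judges select an item by chance,
    when each judge independently selects it with probability p."""
    return sum(
        comb(n_judges, i) * p ** i * (1 - p) ** (n_judges - i)
        for i in range(k, n_judges + 1)
    )


def agreement_threshold(n_judges, p, alpha=0.01):
    """Smallest number of agreeing judges whose chance probability is at
    or below alpha, or None if even unanimous agreement is not rare enough."""
    for k in range(n_judges + 1):
        if tail_probability(n_judges, k, p) <= alpha:
            return k
    return None


# p = 0.2 and p = 0.5 are hypothetical chance selection rates.
print(agreement_threshold(5, 0.2))
print(agreement_threshold(5, 0.5))
```

The sketch shows why the assumed chance rate matters: with five judges and an even chance of selection, unanimous agreement still has a chance probability above .01, so a lower chance rate must have been assumed for any item to qualify.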


Results 


Having theoretically and operationally defined
the role of worker, we can now proceed to
a discussion of our original question. Table 1,
adapted from Corey (1957), presents a
product-moment correlation analysis in which
personal adjustment, social adjustment and job
satisfaction are analyzed for their relationship
to the work-role. Note that in Table 1,
the nonmanual and manual employees are
treated as separate populations in order that
comparisons can be made between the two.

The relationships in Table 1 can be generalized
as two statements:

S1: While the personal adjustment of nonmanual
employees is not related to their
[illegible]tivity in the role of worker, the maximum
degree of positive personal adjustment for
manual employees varies directly as a function
of increasing competency in their performance
of the work-role.

S2: The maximum social adjustment and
job satisfaction of both nonmanual and manual
employees varies directly as a function of
increasing competency in their performance of
the worker role.

The single difference, therefore, between
the nonmanual and manual employee is the
degree to which personal adjustment is
related to the competent performance of their
respective work-roles. Since, however, personal
adjustment has already been defined as
“satisfactory feelings about oneself,” this
essential difference is a crucial one.


Discussion 


It can be concluded from this analysis that
the worker role, as a source of psychological
support for the older employee, is directly
associated with the personal adjustment of the
manual laborer, while there is little or no
association between personal adjustment and
the work-role for the nonmanual employee.
Furthermore, it can be inferred, as a result,
that increasing depersonalization of the
industrial setting has fostered irreconcilable
conflicts not for the nonmanual employee, but
rather, for the manual laborer, whose field of
self-expression, other than his job, is neither
as diversified nor psychologically rewarding
as that of his occupational counterpart.

On the other hand, the relationships in S2
indeed reflect the high premium which an
individual’s society, regardless of his
socio-economic status, places on the worker role.
In any event, broadly defining “Social Adjustment”
as an individual’s group participation
versus his social isolation, it can be assumed
that the worker role, among other things, is
significantly related to both the nonmanual
and manual employee’s social adaptation.

This association, however, undoubtedly exists
apart from the industrial depersonalization
which we previously discussed. In
other words, it is not inconceivable that an
individual’s job may qualify him for social
acceptance as a productive and economically
[illegible] member of his society, but it may not
provide him with an immediate personal
satisfaction. As we have already mentioned, the
extent to which such a personal satisfaction
or adjustment proceeds from the worker role
depends in a large part on the individual’s
occupational status and his work-role
competency.

Summary 


301 Ss, all between the ages of 55 and 65, 
were divided into two occupational statuses, 
nonmanual and manual workers. The former 
status included 116 of the managerial, su- 
pervisory, professional-technical, and clerical- 
sales personnel of a Midwestern oil refinery, 
while the latter status contained 185 of that 
company’s skilled, semiskilled, and unskilled 
laborers. Both groups were then treated as 
separate populations in an analysis of the 
worker role as it related to personal adjust- 
ment, social adjustment, and job satisfaction. 
It was found in the population studied that 
the personal adjustment of nonmanual em- 
ployees was not significantly related with their 
work-role competency, while the personal ad- 
justment of manual employees showed a sig- 
nificant correlation with the worker role vari- 
able. Both social adjustment and job satis- 
faction were significantly related with the 
worker role regardless of occupational status. 
It was therefore concluded that the degree to 
which personal adjustment is related with the 
worker role depends to some extent upon an 
employee's occupational status.


Received September 30, 1958. 


References 


Burgess, E. W., Corey, L. G., Pineo, P. C., & Thorn- 
bury, R. Occupational differences in adjustment 
to aging and retirement. J. Geront., 1958, 13, 
203-206.

Corey, L. G. Role contingency: The relationship 
between certain psychological factors and social 
roles in old age. Unpublished research memoran- 
dum of the Retirement Planning and Preparation 
Program, Industrial Relat. Cent., Univer. Chicago, 
1957. 

Havighurst, R. J. The social competence of middle- 
aged people. Genet. psychol. Monogr., 1957, 56, 
297-375. 

Havighurst, R. J. The meaning of work. Industrial 
Relat. Cent., Univer. Chicago, 1954. 

Sarbin, T. R. Role theory. In G. Lindzey (Ed.), 
Handbook of social psychology. (2d ed.) Cam- 
bridge: Addison-Wesley, 1956. 


Journal of Applied Psychology
Vol. 43, No. 4, 1959


FACTORS IN SUPERVISORS’ PERCEPTIONS OF 
PHYSICAL SCIENCE RESEARCH PERSONNEL 


ROBERT E. STOLTZ 


Southern Methodist University 


A commonly used criterion of productivity 
in areas where there appears to be a lack of 
more objective criteria is the rating of a
person’s work performance by a superior. 
Many studies have pointed out the factorial 
complexity of these ratings in widely diverse 
work areas. To date little has been done to
investigate this problem in the area of physi- 
cal science research work. The present study 
is an initial exploratory attempt to determine 
whether or not such complexities exist when 
supervisors are asked to describe the behavior 
of research workers in physical science fields 
and to tentatively identify any such factors 
that might be found. 


Method 


Intensive interviews were held with each of 27 re- 
search supervisors in a large, Midwestern research 
organization. This organization is devoted almost 
entirely to research in problems of the physical sci- 
ences and engineering. The organization is divided 
into divisions, each of which conducts research in a 
specific subject matter field. The persons interviewed 
were all heads or assistant heads of these divisions. 
The interviewed persons were asked to describe in 
detail the research behavior of the most productive, 
least productive, and most creative man in their di- 
vision. The interviews might be described as some- 
what nondirective and were based on a modification 
of the Flanagan Critical Incident technique. The 
divisions covered by the interviews were engaged in 
research in such diverse areas as ceramic chemistry, 
reactor metallurgy, mechanical engineering, chemical 
engineering, and nonferrous metallurgy.

From the interviews, a pool of over 300 state- 
ments, or items, was extracted on an a priori basis 
by the investigator. With the aid of eight psycholo- 
gists 225 of these items were grouped into 15 clusters 
of 15 items each. These items, with 25 additional 
items which were felt by the judges to be of interest, 
but which could not be categorized, were assembled 
into a checklist. This checklist was termed the Productive Behavior Checklist (PBC). Each page of the checklist contained 30 items and the pages of the booklets were assembled in such a way that no two checklists contained the pages in the same order. This was done in order to avoid any possible tendency of the raters to evaluate items according to their serial position of presentation.

1 The author would like to thank the Battelle Memorial Institute for their cooperation and assistance in conducting this study, and C. L. Shartle, H. B. Pepinsky, and R. J. Wherry for their advice and technical assistance.

Forty heads and assistant heads of research divisions, including 13 of the original interviewees, were given two copies of the PBC and instructed to describe the research behavior of two persons, the most productive and the least productive man in their division. Twenty of the supervisors were told to rate the most productive man first and 20 were told to rate the least productive man first. Each item was to be rated on a five-point scale according to how well the item described the man in question. A rating of five indicated an item that was very descriptive of the man being rated. A score for each cluster was obtained by summing the ratings of the items grouped within that cluster.
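The cluster scoring described above can be illustrated with a minimal sketch. The item wording, cluster names, and ratings below are hypothetical and are not taken from the PBC; the function name `cluster_scores` is likewise an assumption introduced only for illustration.

```python
from collections import defaultdict


def cluster_scores(ratings, item_to_cluster):
    """Sum the 1-5 ratings of the items that belong to each cluster."""
    totals = defaultdict(int)
    for item, rating in ratings.items():
        if not 1 <= rating <= 5:
            raise ValueError(f"rating for {item!r} must be between 1 and 5")
        cluster = item_to_cluster.get(item)
        if cluster is not None:  # items outside any cluster add to no score
            totals[cluster] += rating
    return dict(totals)


# Hypothetical items, cluster names, and ratings (not taken from the PBC).
item_to_cluster = {
    "organizes the work of others": "organization",
    "makes good use of time": "organization",
    "makes things clear to others": "communication",
}
ratings = {
    "organizes the work of others": 5,
    "makes good use of time": 4,
    "makes things clear to others": 3,
    "has a lot of outside interests": 2,  # uncategorized item
}
scores = cluster_scores(ratings, item_to_cluster)
print(scores)
```

Uncategorized items are still rated but contribute to no cluster total, which mirrors the 25 additional items that the judges could not categorize.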

Product-moment intercorrelations were computed between each item and each of the [illegible]. The entire set of 250 items was factor analyzed by the Wherry-Winer technique (1953) and the factors were rotated to orthogonality and simple structure.


Results

Five significant factors were obtained. Since it is impractical in the present space to give all of the items and their complete factor loadings, only a sample of those items that aided in identifying the factors will be presented here.² Table 1 shows this sample and the loadings of each item on the five factors.

Table 1

Factor Loadings of Selected Sample of Items From PBC Grouped by Major Loadings

                                                          Factor
Item                                                    I   II  III   IV    V
Can organize the work of others                        84  -02   14   14   08
Can tell good from bad ideas                           82  -06   16   10   17
Good analytical ability                                 ?  -06   16   05   23
Does things on his own                                 81  -13   25  -01   24
Makes good use of time                                 82  -01   29   00   13
Is realistic in estimating a situation                 87   10   20   10   11
Comes up with new ways of doing things                 83  -20   32  -07   36
Reports do not need to be rewritten                    83  -01   12   28  -02
Does not offend others                                 13   63  -12    ?    ?
Is not irritable                                       15   52  -16  -06   05
Makes friends readily                                  22   51    ?    ?   11
Has no difficulty working with others                  48   50   00  -04   04
At times may upset others                               ?    ?   21  -04   01
Wants recognition                                      02  -31   04   07  -06
Has a need to be recognized                             ?  -32   05   05  -01
Is not impatient with others                          -16    ?  -34   01   06
Will break his back to produce                         69   06   50  -15   00
Willing to put in extra time                           68   03   49  -13   03
Worries about how the job is going                     42  -13   42    ?  -10
Has real interest in job                               78  -02    ?  -02   00
Takes part in many social activities                  -12   19  -33   02   08
Has a lot of outside interests                         10   13  -25   01   10
Does not fly off the handle                            10   54  -22  -06   02
Writes so anyone can understand it                     74   09   02   32  -06
Is not hard for him to write reports                   70  -01   02   31   00
Seldom need to make changes in his reports             83   00   07   29   00
Can express himself in everyday terms                  67   07   00   20   02
Can make things clear to others                        76   06   04   20   05
Goes out of his way to help people                     39   42   07   24   16
Is liked by others                                     40   52   00  -17   11
Is not afraid to go to experts for help                49   26   13  -16   22
Has lots of ideas                                      35  -27    ?  -07   43
Has imagination                                        74    ?    ?    ?    ?
Has developed creative ability                         85  -15   23  -02   32
Thinks there must be a better way of doing things      67  -18   20  -05   32
Can evaluate alternative approaches to the problem     86  -05   18   04   25
Extremely neat and orderly                             35   13   23   10  -35
Likes routine work                                      ?    ?    ?    ?    ?
Can't stand to be unsuccessful                         05  -29   08   16  -22

Note. All decimal points have been omitted.

2 A table giving a complete list of the items from the checklist and their final rotated factor loadings has been deposited with the American Documentation Institute. Order Document No. [illegible] from ADI Auxiliary Publications Project, Photoduplication Service, Library of Congress, Washington 25, D. C., remitting in advance $1.25 for 35 mm. microfilm or $1.25 for 6 X 8 in. photocopies. Make checks payable to: Chief, Photoduplication Service, Library of Congress.

The most difficult factor to interpret is Factor I. While precautions were taken to avoid as much as possible the difficulty of encountering halo effect, it would be unrealistic to assume that it was avoided completely. Consequently, Factor I, which shows high positive loadings for the bulk of the items, might be best regarded as a general factor indicating the extraction of halo effect. However, a number of the items do not show high positive loadings on this factor but do appear as the major contributors to Factor II. Since the items were initially extracted from the interviews because of their apparent relevance to the productive process and since 95% of the items were worded favorably, it might be expected that Factor I would represent a general validity factor indicating the extent to which each item is highly evaluated in describing a productive person. Within the framework of the design used in this study it would be best to interpret this factor as describing General Productive Behavior and in a sense confounding halo

effect and a crude validity index. The items 
which showed the largest loadings on this fac- 
tor dealt with analytical thinking, technical 
knowledge and skill, work-oriented organiza- 
tional ability, willingness to assume responsi- 
bility and take independent action, and tech- 
nical report writing skill. Items showing low 
or negative loadings on this factor dealt with 
interest in social activities, liking for routine 
or administrative work, agreeableness and 
pleasant personality features, and feelings of 
personal inadequacy. 

Factor II has been tentatively named Affa- 
bility. It appears to contain items dealing 
with behaviors that tend to make one liked 
by others. Persons rated high on this factor 
would seem to be agreeable, pleasant and 
good group members. It is of interest to 
note that the items which load on this factor 
that deal with freedom from aggressive acts 
or sensitivity to others' feelings generally show
low or negative loadings on the General Pro- 
ductivity factor. While not investigated in 
this study a suggested hypothesis for future 
study might be that the productive research 
worker is more prone to aggressive attacks on 
others in his work group than the nonproduc- 
tive researcher. In the light of the high 
value given behaviors reflecting strong moti- 
vation on Factor I, we might expect these ex- 
pressions of aggression to be specific to situa- 
tions in which the expression of productivity 
is blocked by some person or thing and to 
be more common among the more highly 
motivated productive researcher. This inter- 
pretation, or hypothesis, would appear to be 
consistent with current frustration-aggression 
theory. 

Factor III has been tentatively named Mo- 
tivation. The items receiving the largest load- 
ings on this factor deal predominantly with 
industriousness, willingness to exert effort, and 
interest in the job. The major negative load- 
ings on this factor come from items dealing 
with patience, calmness, and control of tem- 
per. The supervisor apparently sees the highly 
motivated researcher as one who is more prone 
to some expression of anger. Also negatively related to this factor were items dealing with the social activities of the Ss both within and without the company framework. The items loading on this factor tend to show high positive loadings on the General Productivity factor.

Factor IV reflects the relevance of communication to the supervisors. The items showing the highest loadings on this factor are primarily concerned with the ability of the person to write effectively and to communicate his ideas clearly to others. This factor has accordingly been named Ability to Communicate. The existence of a possible stereotype among the supervisors is indicated by the fact that persons rated high on this factor might be expected to be rated low on their willingness to help others.

As might be expected in a work area such as this, a factor describing creative ability was obtained. Factor V is made up principally of items dealing with versatility, imaginativeness, and ingenuity. This factor has been named Creative Ability. Again there is what might be a supervisory stereotype in that items dealing with liking for routine work, neatness, orderliness, and methodicalness are negatively related to this factor.

Summary and Conclusions 


Forty physical science research supervisors described the behavior of productive and nonproductive research personnel using a 250-item checklist derived from interviews with research supervisors. A factor analysis of the items comprising the checklist resulted in finding five significant factors. These factors have been tentatively named General Productivity, Affability, Motivation, Ability to Communicate, and Creative Ability.

These dimensions will be useful in developing rating scales to more adequately assess the research behavior of persons in this institution and perhaps in others. The analysis also provides material for several hypotheses which might be investigated in other studies.


Received October 10, 1958. 


Reference

Wherry, R. J., & Winer, B. J. A method for factoring large numbers of items. Psychometrika, 1953, 18, 161-179.


Journal of Applied Psychology
Vol. 43, No. 4, 1959


EVIDENCE OF A PRACTICE EFFECT ON THE 
MILLER ANALOGIES TEST 


CHARLES D. SPIELBERGER 


Duke University 


The Miller Analogies Test (MAT) is currently widely employed in the selection of psychology graduate students. Nearly half of the institutions which offer advanced degrees in psychology and almost two thirds of those which grant the Ph.D. require or recommend that the MAT be submitted with applications for graduate training (Moore, 1957). The demonstrated validity of the MAT in the prediction of academic success (Cureton, Cureton, & Bishop, 1949; Fahey, 1953; Jensen, 1953; Kelly & Fiske, 1951; Miller, 1952) suggests that individuals performing poorly on this test are likely to be viewed as unfavorable scholastic risks.

The MAT is required of all applicants to graduate programs in psychology at Duke University. In a number of instances where applicants had reported scores on two of the alternate forms of the test, a marked improvement from the first to the second administration of the MAT was frequently noted. Since the MAT is in wide general use, and since its employment often involves cutoff scores below which an applicant's opportunity for graduate training may be seriously prejudiced, the following studies were designed to test the validity of the observation that scores on the MAT improve with practice.


Experiment I 


Method. Form H of the MAT was group administered to 20 first-year psychology graduate students at Duke University in the fall of 1956. The test was given under standard conditions¹ except that the Ss were specifically told that the results would be used only for research purposes. All of the Ss had taken Form J of the test prior to their acceptance for graduate training.
1 The author is indebted to Robert Colver of the Duke University Bureau of Testing and Guidance, a MAT testing center, for his supervision of the administration of the MAT and his suggestions on the research design. The cooperation and interest of Harold Seashore and the Psychological Corporation in making the MAT available for this research is gratefully acknowledged.


Results. The means² and SDs for Forms
H and J and the correlation between these
forms³ are presented in Table 1 where they
are compared with similar data on 135 gradu- 
ate students reported in the MAT Manual 
(Miller, 1952, p. 6). The difference between 
the means of Form H and Form J was found 
to be highly significant (t = 4.01; p < .001).
Seventeen of the twenty Ss improved their 
scores on taking the test a second time. This 
finding suggested two possible interpretations: 
(a) Form H was easier than Form J, or (b) 
there was a practice effect in taking the MAT 
which resulted in improved performance after 
an initial experience with the test. 
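Because every S took both forms, the comparison of the two means can be illustrated with a t test on paired scores. The sketch below is an assumption about the computation, since the text does not state which form of the t test was used, and the scores in it are hypothetical rather than the Duke data.

```python
import math
from statistics import mean, stdev


def paired_t(first, second):
    """Return (t, df) for the mean of the paired differences second - first."""
    if len(first) != len(second) or len(first) < 2:
        raise ValueError("need two equal-length lists with at least two pairs")
    diffs = [b - a for a, b in zip(first, second)]
    n = len(diffs)
    standard_error = stdev(diffs) / math.sqrt(n)  # SE of the mean difference
    return mean(diffs) / standard_error, n - 1


# Hypothetical first-test and retest scores for five Ss.
first_scores = [70, 72, 68, 75, 80]
retest_scores = [74, 75, 73, 76, 82]
t_value, df = paired_t(first_scores, retest_scores)
print(round(t_value, 2), df)
```

The statistic is compared with the tabled t distribution on n - 1 degrees of freedom. A paired test is the natural choice here because the same Ss contribute both scores, so individual differences in ability cancel out of each difference score.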

Individual interviews with 18 of the Ss 
tended to support the latter hypothesis, and 
suggested that the improvement was due to 
greater familiarity with the nature of the test. 
Two thirds of these Ss reported that they be- 
lieved they had done better on the second 
test. Although the improvement was at- 
tributed to many factors, the Ss stated most 
frequently that they “knew what to expect” 
the second time they took the test. A num- 
ber of Ss also reported that they were less 
anxious or nervous on the second test and 
several specifically related their decreased 
anxiety to greater familiarity with the test. 
Other reasons given to account for felt improvement by at least two Ss were: did not finish the first time; second form of the test was easier; pondered less over ambiguous analogies, which permitted better utilization of the total time. Only three Ss felt they did worse on the second test and each believed that this was because the second form of the test was harder. However, all three of these Ss actually improved. In general, there appeared to be little consistency between the extent of the change and the Ss' estimates of their own performance. In order to cross-validate and test the generality of the finding of improvement in MAT score on retest, a second experiment was performed.

2 A correction is required in order to make scores on Forms J and H equivalent to scores on Form G. This correction, which consists of adding two points to Form H and Form J scores in the 30 to 70 raw score range, was made wherever required in the present study.

3 Although the coefficient of equivalence in the present study was substantially lower than those reported in the MAT Manual, a comparison of the SDs in Table 1 suggested that this could be attributed to the restricted range of talent consequent to the employment of the MAT as one of the criteria in the initial selection of these Ss. When recalculated with appropriate corrections for the restricted range, the obtained reliability coefficient was consistent with those reported between alternate forms in the MAT Manual (Miller, 1952).

Table 1

Means and SDs for Two Alternate Forms of the Miller Analogies Test and Correlations Between These Forms

                                        Form J           Form H
Sample                          N     Mean     SD     Mean     SD      r      t
Duke Students                  20a   72.55    6.5    78.00    7.0      ?      ?
Normative Ss (Miller, 1952)   135    58.5    15.9    57.3    16.0      ?

a In addition to Form J, one S had previously taken Form G and another Form H prior to the administration of Form H on which part of the data of Table 1 is based. Elimination of these Ss yielded means for Form J of 73.50 and Form H of 78.89. The difference remained significant at the .001 level.

Experiment II

Method. Forms G and H of the MAT were administered in counterbalanced order to a second sample of 17 first-year graduate students in the fall of 1957. Since all of the Ss had taken Form J prior to their admission to the University, it was possible to constitute two groups matched on the basis of their scores on this form. Nine Ss (Group I) were first given Form H followed after two weeks by Form G; eight Ss (Group II) were given these same forms in reversed order.

Table 2

Means and SDs for Three Successive Administrations of the Miller Analogies Test
(N = 17)

Mean     SD
78.94    5.9

Results. The means and SDs for all 17 Ss for three successive administrations of the MAT are presented in Table 2. The mean score obtained for the Ss on their second

experience with the MAT was significantly higher than the mean score for their initial performance on the test (t = 3.35; p < [illegible]). The hypothesis of no difference between the mean scores for the second and third test performances could not be rejected. These findings affirmed the generality of the improvement on an alternate form of the MAT subsequent to an initial experience with Form J, and suggested that little additional improvement was likely to result from taking the test a third time. Although the possible lack of equivalence between Form J and the other forms of the MAT still could not be ruled out,⁴ the equivalence of Forms G and H and the effects of the order in which they were taken could be evaluated by analysis of variance (Lindquist, 1953, pp. [illegible], simple latin square design) since these forms had been given in counterbalanced order. The means and SDs for Groups I and II on Forms G and H are presented in Table 3. The F test of the effect of order was not significant, which further indicated that there was no appreciable practice effect from a second to a third administration of the MAT. Although Form G tended to be more difficult than Form H for both groups of Ss, this difference only approached statistical significance (F = 3.89; p < .10). A third experiment, in which the effect of alternate forms was taken into account, was designed to further evaluate improvement on the MAT after an initial experience with the test.

4 It was not possible to test directly the equivalence of J and the other forms since it is apparently the current practice at MAT testing centers to give Form J, the most [...] test, to most [...]. This type of experiment would simultaneously provide a test of [...] the effect of practice.

Table 3

Means and SDs for Forms G and H of the Miller Analogies Test Given in Counterbalanced Order
(N = 8 in each group)

                 Form H            Form G
Sample         Mean     SD       Mean     SD
Group Ia      80.50    8.08        ?       ?
Group II      80.75    7.60        ?       ?

a For this analysis one of the Ss in Group I was randomly eliminated in order to have equal Ns in each group.


Experiment III

Method. Forms H and G were administered to 11 senior psychology undergraduate honors students who had no previous experience with the test. Form H was given first followed after two weeks by Form G. Since Form H had been previously demonstrated to be either equivalent or slightly easier than Form G, any lack of equivalence between these forms would presumably operate to minimize a practice effect, i.e., if an easier form were given prior to a more difficult form, a higher mean score on the easier form would be expected if there were no practice effect.

Results. The mean scores for Forms H and G were 70.55 and 75.27, respectively. The difference between these means was statistically significant (t = 2.73; p < .05). The finding of improvement in an alternate form of the MAT after an initial experience with an equivalent or easier form of the test gives strong evidence of a practice effect. Additional evidence of the consistency of the improvement, and the high reliability of the MAT, was the correlation of .92 between Form G and H for the undergraduate Ss.


Discussion 


To the extent that Form J was equivalent 
to Forms G and H, the data from the three 
samples in the present study consistently in- 
dicated a substantial practice effect on the 
MAT. These results contrast markedly with 
the equivalence between alternate forms re- 
ported in the MAT Manual (Miller, 1952) 
for three large independent samples of gradu- 
ate and senior undergraduate students. A 
possible explanation of the apparent discrep- 
ancy in results might be found in the differ- 
ences between the populations sampled. The 
Ss in the present study were all psychology 
students, a population which might be ex- 
pected to be more sophisticated in taking 
tests. Also, the Ss in the present study scored 
considerably higher than those in the MAT 
normative samples whose mean scores for the 
several forms of the MAT ranged from 49.6 
to 58.5 (5th to 18th percentile on norms for
psychologists). It might be speculated that 
bright, psychologically sophisticated Ss would 
be most likely to profit from experience with 
a test such as the MAT, especially if their 
initial scores were depressed because they did 
not know what to expect on the test. 

In situations where the MAT is employed 
with a cutoff score, a low initial score on the 
test is likely to be most significant in the 
evaluation of potential graduate students. 


Table 4

Means and SDs of Improvement Scores on the Miller Analogies Test

                                             Percentage     Improvement Scores    Expected Improvement
         Initial         Percentile          of Ss                                in Percentile
Group    Score           Range(a)      N     Improving      Mean       SD         Rank(b)
A        80 and above    ?            10         40         -.90        ?         ?
B        70 to 79        ?            23         87         4.26      5.43        52 to 66
C        69 and below    ?            15        100         9.53      5.02        30 to 54

a The percentile range is based on the norms for psychology graduate students (Miller, 1952).
b Expected improvement in percentile rank is given in terms of the change expected of an S whose initial score equals the median for the group and whose improvement equals the mean improvement for the group. The percentile ranks are based on the norms for psychology graduate students (Miller, 1952).

Therefore, the relationship between initial score on the MAT and improvement in score on a subsequent administration of an alternate form of the test was investigated. For this analysis, the data for the three samples of Ss were combined and the total sample was redivided into three groups which consisted of Ss whose initial scores were 80 and above (Group A), 70 to 79 (Group B), and 69 and below (Group C). Improvement scores were obtained for each S by subtracting his initial score from his score on taking the test a second time. Of the 48 Ss, 39 improved. The means and SDs of the improvement scores and the percentage of Ss in each group who improved are presented in Table 4, where it may be noted that all of the Ss in the group with the lowest initial MAT scores improved and that this group showed the greatest mean improvement. The Ss with the highest initial scores did not tend to show any systematic improvement in taking the test a second time. An analysis of variance of the improvement scores (Lindquist, 1953, simple randomized design) yielded highly significant differences in group improvement. Individual t tests [...] (p < [...]) [...] scores. A Pearson product-moment correlation of -.50 indicated the extent of the linear relationship [...] score equal to the mean improvement for his group is given in Table 4.
Although the findings reported in this paper do not directly challenge the validity of the MAT as a predictor of success in graduate training, the application of this instrument to selection problems, especially in situations where cutoff scores are employed, requires that practice effects be taken into account. The relative validities of an applicant's initial and second MAT scores as predictors of his academic potential must be determined in order to optimally utilize the MAT in the selection of psychology graduate students.
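The grouping by initial score and the computation of improvement scores described in this discussion can be sketched as follows. The cutoffs for Groups A, B, and C come from the text; the score pairs, the function names `group_for` and `summarize`, and the output layout are hypothetical and serve only as an illustration.

```python
from statistics import mean


def group_for(initial):
    """Assign an S to Group A, B, or C from his initial MAT score."""
    if initial >= 80:
        return "A"
    if initial >= 70:
        return "B"
    return "C"


def summarize(pairs):
    """Summarize improvement (retest minus initial score) within each group."""
    by_group = {}
    for initial, retest in pairs:
        by_group.setdefault(group_for(initial), []).append(retest - initial)
    return {
        group: {
            "n": len(diffs),
            "pct_improved": 100 * sum(d > 0 for d in diffs) / len(diffs),
            "mean_improvement": mean(diffs),
        }
        for group, diffs in sorted(by_group.items())
    }


# Hypothetical (initial, retest) score pairs, not the data of the study.
pairs = [(85, 84), (82, 85), (75, 80), (72, 71), (65, 76), (60, 68)]
summary = summarize(pairs)
for group, stats in summary.items():
    print(group, stats)
```

An S counts as improved only when the retest score exceeds the initial score, so an unchanged score is not counted as improvement in this sketch.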


Summary

This study was designed to test the hypothesis that scores on the Miller Analogies Test improve with practice. A group of 20 first-year psychology graduate students, all of whom had previously taken the MAT, were given an alternate form of the test. Seventeen of these Ss improved their scores and the mean increase in score for the group was highly significant. The reason most frequently given for their improvement by these Ss was that they "knew what to expect" in their second experience with the test. This finding of a practice effect on the MAT was cross-validated on a second sample of psychology graduate students who also showed a significant mean improvement in score on retest and further confirmed by a similar finding on a third independent group of undergraduate psychology honors students. Of a total of 48 Ss, 39 improved. The improvement in score appeared to be unrelated to the particular alternate forms of the MAT employed. In order to evaluate the relationship between initial MAT score and improvement in score on retest, the data from the three samples were combined. The magnitude of improvement was found to be inversely related to initial score on the test. Maximum improvement in scores occurred for that range of the MAT which might be considered most important from the standpoint of graduate student selection.

Received October 10, 1958. 




References 


Cureton, E. E., Cureton, Louise W., & Bishop, Ruth. Prediction of success in graduate study at the University of Tennessee. Amer. Psychologist, 1949, 4, 361-362.

Fahey, G. L. Discriminatory capacity of the University of Pittsburgh Examination among graduate students in psychology. Amer. Psychologist, 1953, 8, 204-206.

Jensen, R. E. Predicting scholastic achievement of first-year graduate students. Educ. psychol. Measmt, 1953, 13, 322-329.

Kelly, E. L., & Fiske, D. W. The prediction of performance in clinical psychology. Ann Arbor: Univer. Michigan Press, 1951.

Lindquist, E. F. Design and analysis of experiments in psychology and education. Boston: Houghton Mifflin, 1953.

Miller, W. S. Miller Analogies Test, Manual. New York: Psychological Corp., 1952.

Moore, B. V. Educational facilities and financial assistance for graduate students in psychology: 1958-1959. Amer. Psychologist, 1957, 12, 626-647.


Journal of Applied Psychology
Vol. 43, No. 4, 1959


TEAM PRODUCTIVITY AND CONTRADICTION OF
MANAGEMENT POLICY COMMITMENTS ¹


HAROLD B. PEPINSKY, PAULINE N. PEPINSKY

Ohio State University

FRANK J. MINOR

IBM, Endicott, New York

AND STANLEY S. ROBIN

Purdue University ²


A bureaucratic social organization contains units of members whose functions are specialized, and communication between units is narrowly channelized. Typically, a top managing officer has the most authority to make operating decisions for the organization. Working under him in hierarchically ordered lines of authority are successions of persons who have smaller and smaller amounts of authority, until, at the bottom of the heap, workers whose authority is limited to the performance of particular, assigned organizational tasks are found. As top management modifies its operating decisions in response to changing external conditions, task conditions can be expected to vary for workers at the bottom of the organizational hierarchy (Gerth & Mills, 1946, pp. 196-224).

Conditions within an industrial plant and
their consequences for worker
have served


a group task, such that A 
the task performance of B, 
pointed to coordinate the 


1 Research conducted under Contract N6ori-17, T.O. III (NR171-123) between the Office of Naval Research (Group Psychology Branch) and the Ohio State University Research Foundation. This paper is a condensation of a detailed technical report (Pepinsky, Pepinsky, Minor, & Robin, 1957), which may be obtained on loan from the Gifts and Exchange Department of the Ohio State University Library.

2 Formerly at the Ohio State University.


In the experiment, these are represented by a vice-president, a team member appointed as department head, and other members of a work team, respectively. Relationships with a fourth person, D, are also required, in that B, the department head, must conduct transactions with D in the performance of the group task. Depending upon which role is appropriate at a particular time, D is designated as either supplier or buyer. All transactions between the work team and the vice-president or supplier and buyer must be conducted for the team by their department head, and all transactions between department head and supplier must have prior approval by the vice-president. (b) The experimental task is defined as one that requires coordinated team effort and yields quantifiable and reliable measures of task performance. (c) The vice-president’s commitment to the department head is defined as the vice-president’s advance statement of the sanction he will give to any prospective transaction between department head and supplier.

The confirmation condition is one in which the vice-president, by his subsequent action, corroborates his prior commitment: he gives advance notice of his intent to approve or disapprove and behaves accordingly.

The contradiction condition is one in which the vice-president, by his subsequent action, fails to corroborate his prior commitment; he gives no advance notice of his intent, either to the team or to their department head, and his approval or disapproval is based upon information not available to them.

Under the confirmation condition, a team is able to predict in advance the vice-president’s approval or disapproval of transactions between department head and supplier. It is


Productivity and Contradiction of Management Policy 265 


thus assumed that the confirmation condition elicits responses that are compatible with those appropriate to successful task accomplishment. Under the contradiction condition, however, a team is not able to make advance predictions of what the vice-president will approve or disapprove. Since the latter condition is assumed to arouse stereotyped or otherwise incorrect responses that compete with responses appropriate to successful task completion, it is hypothesized that productivity will be lower under the contradiction condition than under the condition of confirmation.


Method 


Eighty white, male Ss, enrolled in the introductory psychology course at the Ohio State University, had volunteered for the study to meet a partial course requirement. Each S was assigned to a four-man experimental team on the basis of scheduling convenience and an effort to avoid placing in the same group Ss who were already well acquainted with each other.

The experimental task was a toy manufacturing problem (adapted from Hemphill, Pepinsky, Kaufman, & Lipetz, 1957), which required a team of four to operate the toy model production department of a small factory. Their assignment was to buy tinker toy parts for different kinds of toys, assemble the toys, and sell them. One of the team, selected at random, was appointed initially as department head. Team members worked in the laboratory at a table, placed in front of a one-way mirror, through which they could be observed. In the center of the room stood two tables for an E, who served at one time as supplier of parts and, as occasion demanded, as buyer of assembled toys. Behind these tables was a screen, on the other side of which the vice-president in charge of production, another E, sat at his desk.
Each team had three consecutive work sessions, each consisting of an initial 5-min. planning period, followed by a 20-min. work period in the first session and by 25-min. work periods in both the second and the third sessions. No production work was allowed in the planning periods, but during the work periods, the team was to order parts for, assemble, and sell toys in order to realize the maximum profit within the allotted time. Five toy models were displayed in the team’s shop, and, prior to each work session, every team member was given documents listing the cost of parts for each kind of toy and the selling price for which each completed toy could be sold. Costs and selling prices were varied according to the complexity of the toy. Costs and selling prices were changed from session to session; profit margins also changed.

For each work session, the team was given chips with which to buy parts. Parts could be ordered for only one kind of toy at a time. The purchase order form had to be signed by


the team and the department head and accompanied 
by sufficient funds to cover the purchase of parts. 
Next, the department head had to take order form 
and funds to the vice-president for his signed O.K. 
of the order. The approved order was then filled by 
the supplier, and the toys were assembled back at 
the shop. Completed orders were turned over by 
the department head to the buyer, who purchased 
the toys if they were correctly assembled. All trans- 
actions between team, vice-president, supplier, and 
buyer were conducted solely by the department head, 
the only person allowed to enter the vice-president’s 
office. 

The first work session was a control condition, during which all teams received identical treatment: the vice-president approved all correctly signed purchase orders submitted to him.

The contradiction condition was established for 
half of the teams during the second and third work 
sessions. Under this condition, the vice-president ap- 
proved or refused to approve, according to a prede- 
termined pattern, purchase orders submitted to him 
by the department head. Purchase Orders No. 1, 4, 
6, 9, 11, etc. were automatically approved if correctly 
signed. Purchase Orders No. 2, 3, 5, 7, 8, 10, 12, etc. 
were not countersigned by the vice-president; in- 
stead, for each of these orders a note was given to 
the department head, to be read aloud to his team 
and informing them that changing market conditions 
had necessitated a temporary suspension of produc-
tion on the kind of toy ordered. An E simply
stopped the clock when the team worked on disap- 
proved orders, however, and the team’s actual work 
time during Sessions 2 and 3 included only that 
spent on approved orders. Although production sus- 
pensions were lifted according to a predetermined 
pattern, team members were never told how long a 
suspension would last. To minimize the likelihood 
of correct predictions by teams run under the con- 
tradiction condition, a promised budget increase and 
carry over of net profit from the second to the third 
work session did not materialize. 
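
The approval pattern just described can be written out as a small sketch. The order numbers listed in the text (1, 4, 6, 9, 11 approved; 2, 3, 5, 7, 8, 10, 12 refused) are consistent with a repeating cycle of five orders; treating the "etc." as a continuation of that cycle is an assumption made here for illustration, and the function name `is_approved` is hypothetical, not taken from the report.

```python
def is_approved(order_number: int) -> bool:
    """Return True if a correctly signed order would be countersigned.

    Assumption: the listed orders follow a cycle of five in which the 1st
    and 4th orders are approved, and that cycle continues past order 12.
    """
    if order_number < 1:
        raise ValueError("purchase orders are numbered from 1")
    return order_number % 5 in (1, 4)


# Reproduce the two lists given in the text for the first twelve orders.
approved = [n for n in range(1, 13) if is_approved(n)]
refused = [n for n in range(1, 13) if not is_approved(n)]
print(approved)  # [1, 4, 6, 9, 11]
print(refused)   # [2, 3, 5, 7, 8, 10, 12]
```

Under this reading, two of every five orders submitted by a contradiction team were approved.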

The confirmation condition was maintained through- 
out the second and third work sessions for the other 
half of the teams. Here, too, a predetermined sched- 
ule of temporary work suspensions was used by the 
vice-president as a guide in approving or disapprov- 
ing purchase orders submitted to him. Under this 
condition, however, the toys to be suspended and 
their times of suspension were clearly specified at 
the beginning of each work session, in the form of 
oral and typed announcements, to every team. To 
arrive at the pattern of suspensions for a given team, 
each was paired with a preceding team run under 
the contradiction condition. The empirical record
kept for the contradiction team was used to specify 
an identical pattern of suspensions for its paired 
confirmation team. Because it was desired to maxi- 
mize the likelihood of correct predictions by teams 
run under the confirmation condition, team members 
under this condition were always correctly informed 
about their future budget allocations. 

In the experiment, 20 four-man teams were di- 
vided into 10 consecutive team pairs. Ten teams 




were run under the contradiction condition and ten 
under the confirmation condition. 


Results 


Several checks were made on the experi- 
mental procedure. First, a postsession ques- 
tionnaire and interviews clearly established 
that the confirmation teams could predict and 
the contradiction teams could not ever learn 
to predict what the pattern of suspensions
would be during the experimental sessions. 
While a majority of both groups claimed that 
their task motivation was increased by the ex- 
perimental conditions, a significantly greater 
proportion of the contradiction team mem- 
bers reported that their teams became more 
disorganized and less efficient when they 
could not anticipate temporary production 
suspensions. Yet neither group of Ss re- 
garded themselves as having been punished 
during the experiment. These results suggest 
that teams under the contradiction condition 
were not less motivated to perform the experi- 
mental task nor more negatively reinforced in 
performing it than teams under the confirma- 
tion condition; the contradiction condition, 
however, did seem to elicit a greater number 
of responses that were extraneous to efficient 
task performance. 

Second, a check upon the task motivation 
effects of the confirmation and contradiction 
conditions was made by correlating amount of 
profit per order with 5-min. work periods
within each session for teams run under each
condition. Positive correlations for both
groups were anticipated for all three sessions;
the correlations for the confirmation groups
should have been significantly greater than 


H. B. Pepinsky, P. N. Pepinsky, F. J. Minor, and S. S. Robin 


the correlations for the contradiction groups,
however, only if profit were to be viewed as
a greater incentive for the confirmation group. 
All correlations were significantly positive and 
increased slightly from session to session, but 
correlations for teams run under the two con- 
ditions do not differ significantly from each 
other. This result supports the view that the 
task motivation effects of the two experi- 
mental conditions were similar. 

A third procedural check was made to determine whether the pattern of temporary production suspensions was comparable for the ten teams within each experimental condition. Because profit margins differed for the various toy models, the experimental stimuli could be regarded as comparable within a condition only if restrictions upon the production of each model were equally distributed among the ten teams. The check was made by plotting the frequency with which purchase orders for the five kinds of toys were disapproved for each of the ten contradiction teams during the second and third work sessions. A chi-square test of association between toy suspension and team indicates that these variables are independent. Hence, productivity scores for the teams within each condition could be pooled for every session, and direct comparisons could be made between the pooled scores of the confirmation and contradiction groups.
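
A minimal sketch of the kind of chi-square test of association named above, assuming a teams-by-toys table of disapproval counts. The counts shown are hypothetical placeholders (the article does not report the frequencies), and the function name `chi_square_independence` is introduced here only for illustration.

```python
def chi_square_independence(table):
    """Pearson chi-square statistic and degrees of freedom for an r x c table.

    `table` is a list of rows of observed counts; every row and column
    total must be greater than zero so that expected counts are nonzero.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(col_totals) - 1)
    return statistic, dof


# Hypothetical counts: rows are teams, columns are the five toy models.
disapprovals = [
    [2, 3, 2, 3, 2],
    [3, 2, 3, 2, 2],
    [2, 2, 2, 3, 3],
]
statistic, dof = chi_square_independence(disapprovals)
print(f"chi-square = {statistic:.2f} on {dof} degrees of freedom")
```

A statistic that is small relative to the tabled chi-square value for its degrees of freedom is what would support treating suspension pattern and team as independent.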

Tests now could be made of the experimental hypothesis: that team productivity would be greater under the confirmation than under the contradiction condition. In making these tests, it was predicted that the two


Table 1

Team Pair Differences in Net Profit
(Confirmation team score - Contradiction team score)

                                              Session
                                      1       2       3      2&3
Mean difference                     -.33     .83    1.24    2.07
t (sess. 1 diff.)                   -.95
Regression coefficient                      -.03    -.18    -.21
t (regr. coeff.)                            -.14    -.66    -.59
Experimental (residual) variation            .82    1.18    2.00
t (exper. var.)                             3.36**  4.21**  [illegible]

Note. Minus sign indicates confirmation scores less than contradiction scores.
** Significant at .01 level.


Table 2

Team Pair Differences in Net Profit Per Minute of Actual Work Time
(Confirmation team score - Contradiction team score)

                                              Session
                                      1       2       3      2&3
Mean difference                     -.02     .01     .03     .02
t (sess. 1 diff.)                   -.95
Regression coefficient                      -.02    -.12    -.06
t (regr. coeff.)                            -.09  [illegible]  .40
Experimental (residual) variation            .01     .03     .02
t (exper. var.)                             1.02    2.47*   2.46*

Note. Minus sign indicates confirmation scores less than contradiction scores.
* Significant at .05 level.


sets of teams would not differ significantly in their productivity during Session 1 (the control session), but that they would differ significantly in Sessions 2 and 3 (the experimental sessions). The results of comparing the contradiction and confirmation groups in terms of the teams’ net profit accumulated in each work session are shown in Table 1. The initial, Session 1, difference scores are not significantly different from zero, nor are the regression coefficients of the Session 2 and 3 scores on Session 1 scores. In every case, however, the variation attributable to the experimental conditions is significant: the amount of net profit earned by the confirmation teams is significantly greater than that earned by the contradiction teams in Sessions 2 and 3 and in the combined sessions. For net profit, then, the stated prediction is supported by the data.

A more rigorous test of the hypothesis is provided by comparing the performance of confirmation and contradiction teams on the basis of their net profit per minute of actual work time. These results are reported in Table 2. Again, the mean difference between the scores of team pairs is not significantly different from zero. The mean differences between the scores of team pairs that can be attributed to the experimental conditions are significant in Session 3 and in the combined second and third session, but not in Session 2. Since the combined Session 2 and 3 scores are more reliable than those of either session alone, the results are interpreted as supporting the hypothesis. It may be noted, though, that the obtained differences are significant at the .05 level of probability, whereas for the net profit scores the differences were significant at the .01 level.


Discussion 


In summary, the experimental results are 
consistent with the hypothesis: the confirma- 
tion teams were more productive than the 
contradiction teams, both in net profit earned 
during the experimental sessions and in net 
profit per minute of actual work time. A sub- 
sidiary prediction in respect to net profit per 
number of completed orders was not reliably 
supported by the data, although even here 
the trend was in the expected direction. 

The experimental results are interpreted 
not only as providing support for the central 
hypothesis, but as lending credence to its un- 
derlying rationale. Specifically, it can be in- 
ferred that the contradiction condition lowers 
team productivity (a) because it arouses re- 
sponses that are not appropriate to the suc- 
cessful completion of the task and (b) be- 
cause the occurrence of these responses op- 
erates to reduce the frequency with which 
responses appropriate to the task can occur. 
While this rationale has internal consistency, 
it is also given empirical support by the ob- 
served behavior of the Ss in the experiment. 
Under the confirmation condition, with fore- 
knowledge of what production suspensions 
were to occur, there were but rare occasions 
in which a team seemed to have difficulty in 
adjusting from the operating freedom of Ses- 
sion 1 to the production restrictions of the 
second session. Under the contradiction con- 
dition, however, the Ss seemed to experience 
considerable difficulty in adjusting to the new 
and unpredictable state of affairs. This was 




manifested in many ways: e.g., trouble in co- 
ordinating activity for the building of new 
toys; time spent in trying to predict what new 
orders the vice-president would disapprove; 
joking about the actions of the Es; hostility 
toward the supplier, who was seen as being 
too slow or as short-changing the team in 
money or supplies; filling out orders incor- 
rectly; or in the team’s becoming immobi- 
lized. We can interpret these actions as “ir- 
relevant” in the sense that they do not fa- 
cilitate successful task completion. It should 
be kept in mind that when such irrelevant 
action occurred during the discussion of dis- 
approved orders, it was not counted against 
the actual work time of a contradiction team. 
In one form or another, however, irrelevant 
action occurred frequently enough during the 
processing of approved orders to cut down 
materially on the profit-making activity of 
the contradiction Ss. 

Alternative rationales might have yielded 
similar predictions, e.g., those provided by 
students of statistical uncertainty effects 
(Hake, 1955; Macy, Christie, & Luce, 1953), 
of anxiety (Brown & Farber, 1951; Pepinsky 
& Pepinsky, 1954; Taylor, 1956), and of am- 
biguity intolerance (Blake & Ramsey, 1951). 
Indeed, a more extended statement of the present rationale would indicate its indebtedness to all of these contributions. A major implication of the experiment is that what [...] be predicted for [...] is a major criterion of group productivity, (b) where [...] members is required for success on an assigned task, and (c) where the team must function within a hierarchical structure in which communication is narrowly channelized. It must be kept
in mind, however, that in this study (and in 
most other small group experiments) team 
members performed together on one occasion 
only. The effects of these conditions main- 
tained over extended time might be increased 
or minimized by compensatory or adaptive 
responses of the team. 


Summary 


A simulated small industrial plant was the setting for an experiment in which a team of Ss worked together on a manufacturing problem. Their assigned task was to produce different kinds of toys at a profit. Team productivity, the dependent variable, was operationally defined as the amount of net profit earned by the team. A three-level hierarchical group structure was used in which all transactions between the team and a vice-president or a supplier and buyer had to be conducted by an appointed department head, and all supply orders required prior approval of the vice-president.

Twenty four-man teams were divided into ten consecutive team pairs, each member of a pair being subjected either to (a) a condition under which the team’s expectations of management were contradicted by subsequent events or (b) a condition under which the team’s expectations were confirmed. The hypothesis that team productivity would be greater under the confirmation condition was supported by the data. Some theoretical implications of the experiment were suggested.


Received October 13, 1958. 


References 


Blake, R. R., & Ramsey, G. V. (Eds.) Perception: an approach to personality. New York: Ronald Press, 1951.

Brown, J. S., & Farber, I. E. Emotions conceptualized as intervening variables—with suggestions toward a theory of frustration. Psychol. Bull., 1951, 48, 465-495.

Hake, H. W. The perception of frequency of occurrence and the development of “expectancy” in human experimental subjects. In H. Quastler (Ed.), Information theory in psychology. Glencoe, Ill.: Free Press, 1955. Pp. 257-277.

Hemphill, J. K., Pepinsky, Pauline N., Kaufman, A. E., & Lipetz, M. E. The effects upon attempts to lead of task motivation and expectancy of accomplishment of the task. Psychol. Monogr., 1957, 71, No. 22 (Whole No. 451).

Macy, J., Jr., Christie, L. S., & Luce, R. D. Coding noise in a task oriented group. J. abnorm. soc. Psychol., 1953, 48, 401-409.

Pepinsky, H. B., & Pepinsky, Pauline N. Counseling: Theory and practice. New York: Ronald Press, 1954.

Pepinsky, H. B., Pepinsky, Pauline N., Minor, F. J., & Robin, S. S. Motivational factors in individual and group productivity: VI. Team productivity as related to confirmation or contradiction by management of its commitments to an appointed leader. Ohio State Univer. Res. Found., 1957. (Multilith.)

Taylor, Janet A. Drive theory and manifest anxiety. Psychol. Bull., 1956, 53, 303-320.



Journal of Applied Psychology
Vol. 43, No. 4, 1959


THE USE OF CRITICAL INCIDENTS IN A FORCED- 
CHOICE SCALE 


BRIAN R. KAY 


University of New Hampshire 


In recent years there have been two developments in the field of evaluation of performance: the appearance of the critical incident technique proposed by Flanagan (1954) and the forced-choice type scale described by Sisson (1948). A preference for objective description of behavior was at the root of the former, while in the latter a scale where a rater’s enthusiasm for or against the man would no longer have free rein was the objective. Sisson (1948), in listing the assumptions of the forced-choice method, states that differences in competence or efficiency can be described in “objective, observable items of behavior.” However, the forced-choice scale produced by the Personnel Research Section, TAGO, includes such items, without elaboration, as “egotistical,” “nervous,” “easy-going,” “cool-headed,” and “anti-social.” Are not the meanings of these words open for discussion in any group?

It seemed reasonable to consider combining the strengths of both methods: to form tetrads of critical incidents in a forced-choice scale covering the areas of behavior in which men were to be evaluated. Clearly, because an incident is cited by someone as being illustrative of critically effective or ineffective behavior, it does not mean that it is necessarily held to be so by others. Fortunately, the process of arriving at an index of discrimination allows us to ascertain the degree of consensus.


Procedure 


The evaluation of the performance of foremen was the objective, and the location, a manufacturing department of a plant employing approximately 500 persons. After a general briefing of all levels of supervision from assistant foremen to the plant manager, confidential interviews were conducted by the writer. A random sample of nonsupervisory personnel was drawn from all members of the manufacturing department, while all incumbents of the other levels were interviewed. No guidance was given to the persons being interviewed other than a definition of the material sought, namely, critical incidents describing something outstandingly effective or ineffective that a foreman had done. On completion of the description of the incident, the substance was recorded. An effort was made to disguise the origin of the information and to whom it applied, while retaining the significance of the incident. Each recorded incident was then checked with the informant for accuracy and clarity. The total number of incidents gathered was 691, with representation from all levels as presented in Table 1.

An analysis of the 691 incidents reduced the num- 
ber to 337 on the grounds of overlapping, duplica- 
tion, etc. The degree to which the behavior de- 
scribed in the incidents actually applied to foremen 
was determined. The key was the same reported 
by Sisson (1948): 


1. Exceedingly high or highest possible degree
2. To an unusual or outstanding degree
3. To a typical degree
4. To a limited degree
5. To a slight degree or not at all


Superintendents and supervisors (13) were requested 
to think of three foremen that they knew very well 


who were, respectively: 
1. Outstandingly effective in over-all competence 
2. Average, no worse than nor better than most 
in over-all competence 
3. Least effective in over-all competence 


The foremen’s superiors proceeded through the 337 mimeographed incidents, indicating for each the applicability of the item to each of the foremen representing the three levels of competence. The standard procedure for computing the preference and discrimination indices was used (Sisson, 1948). The distribution of the discrimination values, plotted on probability paper, yielded a straight line. Pairs of items were then selected, one having maximal discriminative power between effective and ineffective foremen and another minimal, but equal in preference value to the first. The experimental forced-choice scale was composed of twenty pairs of effective and thirty pairs of ineffective behaviors.


Table 1

Sources of Incidents

                                            Sum of     Mean Number
Group                                 N    Incidents    per Person
Supervisors, Superintendents,
  Managers                           15       199           13
Foremen and ass’t foremen            34       309            9
Nonsupervisory personnel             25       183            7

The paired-comparison method of ranking foremen was used as a criterion which on a test-retest assessment (a month apart) yielded reliabilities of .93, .83, and .76 for the three sections with which we were concerned. The foremen were rated by their immediate supervisor and superintendent, with each acting independently of the other. Both superiors had sufficient contact with their foremen to justify this procedure. The criterion was thus the mean rank orders assigned by these two men for the two occasions. The same supervisors and superintendents three months later completed the forced-choice scale on their own foremen. To test the validity of the experimental scale, a rank order coefficient was computed to assess the degree of relationship between the mean scores of the forced-choice scale and mean paired-comparison scores.
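
The article does not say which rank order coefficient was computed. As a hedged illustration only, the sketch below uses Spearman's rho in its simple form, which assumes both sets of ranks are complete and free of ties (mean rank orders such as the criterion described above could contain ties, which this form does not handle). The function name and the example ranks are hypothetical.

```python
def spearman_rho(ranks_a, ranks_b):
    """Spearman rank-order coefficient for two untied rankings of the same men.

    Uses rho = 1 - 6 * sum(d^2) / (n * (n^2 - 1)), where d is the
    difference between a man's two ranks.
    """
    n = len(ranks_a)
    if n != len(ranks_b) or n < 2:
        raise ValueError("need two rankings of equal length, n >= 2")
    sum_d_squared = sum((a - b) ** 2 for a, b in zip(ranks_a, ranks_b))
    return 1 - 6 * sum_d_squared / (n * (n * n - 1))


# Hypothetical ranks for five foremen on the criterion and on the scale.
criterion_ranks = [1, 2, 3, 4, 5]
scale_ranks = [2, 1, 4, 3, 5]
print(spearman_rho(criterion_ranks, scale_ranks))
```

A coefficient near zero in each section would correspond to the lack of relationship reported in the results that follow.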


Results and Discussion 


None of the relationships between rank or- 
ders based on forced-choice total scores and 
the paired-comparison rankings was found 
even to approach significance in any of the 
three sections. The correlation between the in- 
dividual supervisor's paired-comparison rank- 
ing and the scores achieved by his men on 
the forced-choice scale also was not signifi- 
cant. 

Following Kelley (1939), the discrimina- 
tive values of all items on the scale were ana- 
lyzed for the top and bottom 27% of fore- 
men rated by the paired-comparison method. 
The validity of each item was determined by 
reference to Flanagan’s (1939) table as given 
in Thorndike (1949, Appendix B). This 
analysis yielded only 7 significant items of 
the 50 that were discriminating according to 
the original assessment. The results were 
convincingly discouraging and no further 

analyses were undertaken. 
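
The selection of upper and lower criterion groups described above can be sketched as follows. Only the 27% figure comes from the text; the rounding rule and the function name `extreme_groups` are assumptions made here for illustration.

```python
def extreme_groups(ranked_ids, fraction=0.27):
    """Split a best-to-worst ranking into upper and lower criterion groups.

    Assumption: the group size is n * fraction rounded to the nearest
    whole number, with at least one person in each group.
    """
    if not 0 < fraction <= 0.5:
        raise ValueError("fraction must be greater than 0 and at most 0.5")
    n = len(ranked_ids)
    k = max(1, round(n * fraction))
    return ranked_ids[:k], ranked_ids[-k:]


# Hypothetical section of 20 foremen, identified by criterion rank.
upper, lower = extreme_groups(list(range(1, 21)))
print(upper)  # [1, 2, 3, 4, 5]
print(lower)  # [16, 17, 18, 19, 20]
```

Each item's responses in the two groups would then be compared to estimate its validity, as in the Flanagan table cited above.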

One interpretation of the results might be that in the establishment of the criterion, the rater could allow his partiality free rein, whereas the forced-choice scale had been successful in preventing this from occurring. This is rejected, however, as the scores on the forced-choice scale for all foremen showed a mean of 26 and a standard deviation of 23. This leads us to believe that the scale is not discriminating, for the most likely value, if chance alone were operating, would be 25.
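
The chance value of 25 can be checked with a short worked calculation. It rests on an assumption the article does not state explicitly: that each of the 50 pairs on the scale contributes one point to the total score and that, under chance alone, each point is earned independently with probability one half. With X the total score, n = 50, and p = 1/2:

```latex
\[
E[X] = np = 50 \times \tfrac{1}{2} = 25,
\qquad
\sigma_X = \sqrt{np(1-p)} = \sqrt{50 \times \tfrac{1}{2} \times \tfrac{1}{2}} = \sqrt{12.5} \approx 3.54 .
\]
```

Under this assumed scoring model, an observed mean of 26 lies well within one chance standard deviation of the expected chance score.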


The raters, when interviewed on their reactions to the use of the scale, threw some light on the results. A common reaction was their claimed inability to actually decide which of the two alternative behaviors described most or least the man being rated. However, it will be recalled that, on arriving at an index of discrimination, the rater actually did assess the degree of applicability of the behavior to the most effective, average, and least effective foreman he knew. It might be posited that the success in one and the failure in the other arises from halo effect with anonymity in the first as compared with identification of the rated in the latter.

However, a simpler explanation merits attention. The distribution of discrimination indices for the 337 incidents conformed very closely to the normal probability curve. We are forced to conclude that the cases where maximal and minimal discrimination indices were obtained were chance variations and our judges throughout actually were incapable of assessing the degree of likelihood of effective, average and ineffective foremen doing that which was described in the critical incident. The results obtained in this study therefore would seem to discourage the feasibility of using the level of specificity provided by the critical incident technique, despite the fact that objective description of behavior for many people has preference over inferred personal characteristics in contemporary methodology.


Received October 20, 1958. 


References 


Flanagan, J. C. General considerations in the selection of test items and a short method of estimating the product-moment coefficient from the data at the tails of the distribution. J. educ. Psychol., 1939, 30, 674-680.

Flanagan, J. C. The critical incident technique. Psychol. Bull., 1954, 51, 327-358.

Kelley, T. L. The selection of upper and lower groups for the validation of test items. J. educ. Psychol., 1939, 30, 17-24.

Sisson, E. D. Forced choice—The new army rating. Personnel Psychol., 1948, 1, 365-381.

Thorndike, R. L. Personnel selection: Test and measurement techniques. New York: Wiley, 1949.


Journal of Applied Psychology
Vol. 43, No. 4, 1959


DIFFERENCES BETWEEN WELL AND POORLY
ADJUSTED GROUPS IN AN ISOLATED
ENVIRONMENT 1


LEO R. EILBERT AND ROBERT GLASER 2


American Institute for Research, Pittsburgh, Pennsylvania 


The present study is concerned with the identification of possible predictors of personal adjustment to conditions of Arctic isolation. A considerable number of persons serving the interests of government and industry are currently living in such geographically remote areas. With the advent of space [...], this trend toward isolated living will undoubtedly be further augmented.

Besides the survival problems posed, life in the Arctic can pose serious adjustment problems for the individual (Eilbert, Glaser, & Hanes, 1957). The Arctic environment is restrictive and characterized by deprivation. For a large portion of the year, living is confined to indoors. The environment deprives the individual of many familiar social [...], presents him with repetitive [...] to the point of satiation, and places him in a situation that allows him practically no privacy. Under such conditions, it is not surprising to find that morale and efficiency are often adversely affected. Finding leisure activities which can be related to personal development and satisfaction represents a major problem (Air Site Project Staff, 1952; Air Site Project Staff & [...], 1952b; Eilbert, Glaser, & Hanes, 1957).

The objective of this study was to identify variables for the development of selection techniques to minimize the number of personal adjustment problems of men at isolated Arctic military bases. The Ss of the study were 648 enlisted Air Force personnel assigned to eight Arctic bases. The ages of the men in these groups ranged from 18 to 47 years, with a median age of 20 years. They had been in their present isolated station from 2 to 12 months, with a mean of 7 months. Seventy-six per cent were enlisted airmen and 24% were noncommissioned officers.

1 This research was supported in part by the United States Air Force under Contract No. [...], monitored by the Personnel Laboratory, Air Force Personnel and Training Research Center, Lackland Air Force Base, Texas. Permission is granted for reproduction, translation, publication, use, and disposal in whole and in part by or for the United States Government.

2 The authors wish to express their appreciation to Murray Glanzer for his valuable assistance in the [...] of the study.

Procedure 

The working definition of Arctic adjustment adopted was effectiveness on the job and the ability to get along with others. The measure of ability to function in the Arctic was rating by immediate supervisors. These supervisors nominated the best and most poorly adjusted men in their section or detachment. A score of plus one and minus one was assigned to positive and negative nominations on each of nine items of a supervisor rating form, and the men of each section were classified on the basis of their total scores. Means and standard deviations were computed for these supervisor nomination scores by section. Men whose scores were more than one sigma above or below the mean for their section were selected for inclusion in the “well adjusted” (high) and “poorly adjusted” (low) groups. The high and low groups consisted of 112 and 83 men, respectively.
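
The selection rule just described can be sketched for a single section as follows. The names and scores are hypothetical, and the use of the population standard deviation (`statistics.pstdev`) rather than the sample form is an assumption; the article does not say which was computed.

```python
from statistics import mean, pstdev


def classify_section(scores):
    """Return (high, low) lists of men for one section.

    `scores` maps each man to his total supervisor nomination score
    (+1 per positive and -1 per negative nomination across the items).
    Men more than one sigma above or below the section mean are selected.
    """
    values = list(scores.values())
    section_mean = mean(values)
    section_sigma = pstdev(values)  # assumed: population standard deviation
    high = [man for man, s in scores.items() if s > section_mean + section_sigma]
    low = [man for man, s in scores.items() if s < section_mean - section_sigma]
    return high, low


# Hypothetical section of five men.
section = {"A": 9, "B": 0, "C": 0, "D": 0, "E": -9}
high, low = classify_section(section)
print(high, low)  # ['A'] ['E']
```

Because the cutoffs are computed within each section, a man is compared only with others rated by the same supervisors.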

The two groups were compared for differences in 
the general areas of personal background, person- 
ality characteristics, and medical complaints. Some 
of the instruments used were exploratory and some 
were selected because they had shown promise in 
previous research (Sharp & Harper, 1953; Sharp, 
Goldstein, & Bolanvich, 1954; Stunkel, Tye, & 
Yaukey, 1952). The following survey and test 
instruments were used: 

1. Biographical Inventory. The 150-item inventory
used contained items that tapped the following
areas: military history, employment record,
educational background, family background,
organizational membership, friendship patterns,
marital history, sports and hobbies, personal
characteristics, and aspirations and plans.


2. Self-Appraisal Blank. This consisted of 42
forced-choice quintets of descriptive adjectives and
phrases. The men were asked to record which item
of each quintet was most descriptive and which was
least descriptive of themselves.

3. Incomplete Sentences Test. The form used
contained 70 items and was based largely on the
Holsopple and Miale Test (1952) and the Incomplete
Sentences Test for Pilots (American Institute for
Research, 1953).

4. Medical Symptoms List. This was a check list 
of 68 items representing a relatively comprehensive 
coverage of medical complaints. 

5. Anxiety Scale. A slightly modified version of 
the Taylor Manifest Anxiety Scale (1953) was used. 

6. Food Aversion List. Men reported their dis- 
likes to items in a list of foods. This list was based 
on findings of previous investigators (Altus, 1949), 
suggesting that aversions to these foods are related 
to neuroticism. 

7. General Information Test. Earlier studies 
(Sharp & Harper, 1953; Sharp, Goldstein, & Bolan- 
vich, 1954) had indicated that certain types of 
information were related to Arctic adjustment. 
Forty multiple-choice items sampled automotive
information, sports, literature, and art.

8. Peer Nomination Form. This was composed of
ten items closely resembling those used in the
supervisor ratings. Men were asked to nominate the
best and poorest men in their sections in response to
questions about job proficiency, ability to get along
with others, and general adjustment to the Arctic
and to the Air Force.


In addition to these measures, Air Force aptitude
test and job proficiency test scores were obtained
for the men in the two groups. Medical record
data for these men showing the number of sick call
visits and the number of hospitalizations were also
obtained.


Results 

Since the purpose of the study was exploratory,
the levels of significance used were not stringent:
the level required for [illegible] differences was
the 5% level, with the 10% level for test items.
No significant differences in the mean scores of the
high and low criterion groups were found for the
Medical Symptoms List, Anxiety Scale, Food Aversion
List, or General Information Test. The salient
findings of survey instruments which were found to
differentiate the criterion groups can be summarized
as follows:

Biographical Inventory. Analysis consisted
of chi-square comparison of the high and low
groups’ response choices to each item. Thirty
of the 150 items were found to yield differences
between the criterion groups that were
statistically significant (p < .10). The personal
history characteristics that were found
to differentiate members of the poorly adjusted
group were: urban background, relatively
high socioeconomic background, and a
Leo R. Eilbert and Robert Glaser 


history of minor infractions of military rules 
and regulations. A statistically significant 
nonlinear relationship was found between the 
measure of adjustment and the age at which 
independence from family was achieved, i.e.,
having their own money, buying their own 
clothes, and going on dates. Men who re- 
ported this independence at relatively young 
or old ages were more prone to be in the 
poorly adjusted group. 
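
As a rough illustration of the item analysis described for the Biographical Inventory, the sketch below computes a Pearson chi-square for a single item. The 2 x 2 layout and the cell counts are invented (the report gives neither the response categories nor any item-level counts); only the group sizes of 112 and 83 and the 10% criterion come from the text. The last line notes how many of 150 items would be expected to pass a 10% criterion by chance alone if the items were independent, which is an assumption made here for perspective and not a claim from the report.

```python
def chi_square(table):
    """Pearson chi-square statistic for a contingency table (rows of counts)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    return stat

# Hypothetical item: "urban" vs. "rural" background, by criterion group.
table = [
    [40, 72],  # high group (112 men), invented split
    [45, 38],  # low group (83 men), invented split
]
stat = chi_square(table)

CRITICAL_10_DF1 = 2.706  # chi-square critical value, df = 1, 10% level
significant = stat > CRITICAL_10_DF1

# Items expected to reach the 10% level by chance among 150 independent items.
expected_by_chance = 150 * 0.10
print(round(stat, 2), significant, expected_by_chance)
```

Seen this way, the 30 significant items reported are about twice the number that chance alone would be expected to produce under the independence assumption, which is one reason cross validation of such keys matters.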

Self-Appraisal Blank. Scoring was based 
on an Arctic key which had been empirically 
derived by previous investigators (Sharp et
al., 1953, 1954; U. S. Army, The Adjutant
General’s Office, 1949). Significant differences
between the mean scores of the high
and low groups were found (t test, p < .01).
In general, members of the well adjusted
group tended to describe themselves as
conscientious and responsible individuals who
accept rather than resist authority. The men
who were judged to be poorly adjusted tended 
to describe themselves in other, less consistent 
terms. 

Incomplete Sentences Test. This test was
used to investigate the personality and attitudinal
characteristics that might differentiate
the well adjusted from the poorly adjusted
group. Specific subtest areas were: attitude
toward Arctic assignment, fears and complaints,
attitude toward work, interpersonal
and family relationships, moral and sexual
attitudes, and goals and aspirations. An
a priori rationale prepared for each item and
developed on a holdout sample of 100 was
used as the basis for scoring. Using these
rationales, responses which were consistent
with good adjustment were scored two;
indeterminate responses were scored one;
responses suggestive of poor adjustment were
scored zero. Significant differences between
the mean scores of the high and low groups
were found (t test, p < .01).* Of the two
groups, the member of the poorly adjusted
group was found to do more complaining, be
more fearful of the Arctic, have greater
difficulties in his interpersonal relationships, be
less inclined to do better than marginal work,
and be more concerned about the possibility

* Since, in this analysis, the direction of the
difference between the two groups was hypothesized,
a one-tailed test was used.


Adjustment in an Isolated Environment 


that his Arctic assignment would disrupt his 
relationship with his wife or girl friend. 

Peer Nomination Form. The scoring pro- 
cedure for the Peer Nomination Form was 
similar to that described for the supervisor 
rating form. Significant differences in the 
distributions of peer nomination scores for 
the high and low groups were found
(Kolmogorov-Smirnov test, p < .001). Men who
were identified by their supervisors as being
well adjusted to the Arctic were also likely
to be considered well adjusted by the other
men. Conversely, men identified by the
supervisors as being poorly adjusted were also
likely to be so judged by the other men.

Job Proficiency Tests. Air Force proficiency
test scores were used to compare the
high and low groups. The mean score of
men judged to be well adjusted was found to
be significantly higher (t test, p < .001) than
that of the poorly adjusted group.

Aptitude Scores. Aptitude scores were
obtained from each airman’s personal records.
These scores were based on the tests included
in the Airman Classification Battery and the
Airman Qualification Examination. An average
aptitude test score was computed for each
man. The distributions of the mean aptitude
test scores of the two criterion groups were
compared. The difference between the
distributions of mean aptitude scores for the
“high” and “low” groups was found to be
statistically significant (Kolmogorov-Smirnov
test, p < .05).

Sick Call Rate. Sick call records were
obtained for 66 men of the well adjusted and
[number illegible] of the poorly adjusted group.
For each group the number of men who made 1,
2, 3, 4, etc. sick call visits was tabulated. It
was found that the incidence of these visits
was greater for the poorly adjusted group
(Kolmogorov-Smirnov test, p < .05).
[Illegible] half of the members of the well adjusted
group had had no attendance at sick call.
The poorly adjusted group, although the
smaller of the two, accounted for 66% of the
total number of sick call visits of the combined
groups.
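
The Kolmogorov-Smirnov comparisons reported in this section rest on the largest gap between two cumulative distributions. A minimal sketch of that statistic follows, using invented visit counts for ten men per group; the actual tabulations are not given in the report, and no significance level is computed here.

```python
def ks_statistic(sample_a, sample_b):
    """Largest absolute difference between two empirical cumulative
    distribution functions, evaluated at every observed value."""
    points = sorted(set(sample_a) | set(sample_b))
    largest_gap = 0.0
    for x in points:
        cdf_a = sum(v <= x for v in sample_a) / len(sample_a)
        cdf_b = sum(v <= x for v in sample_b) / len(sample_b)
        largest_gap = max(largest_gap, abs(cdf_a - cdf_b))
    return largest_gap

# Hypothetical numbers of sick call visits per man.
well_adjusted = [0, 0, 0, 0, 0, 0, 1, 1, 2, 3]
poorly_adjusted = [0, 0, 1, 2, 2, 3, 4, 4, 5, 6]
d = ks_statistic(well_adjusted, poorly_adjusted)
print(d)
```

A large value of the statistic indicates that one group's visits are concentrated at lower counts than the other's, which is the pattern the authors describe.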


Discussion 
The data reported here were collected in the
Arctic. The extent to which similar


measures collected prior to Arctic assignment 
would be predictive of adjustment is not 
answered by these data. However, the meas- 
ures which are suggested for consideration in 
the development of a selection battery are 
readily obtained before assignment to an iso- 
lated environment, i.e., biographical data, self- 
appraisal, attitudes toward the job situation, 
anxiety about interpersonal relationships, job 
aptitudes, and job proficiency, judgments by 
peers, and medical record. In general these 
results suggest the hypothesis that individuals 
who adjust well to Arctic isolation are indi- 
viduals who also adjust well to their military 
assignments elsewhere. Isolated environments 
probably present a more extreme stimulus 
situation which more frequently and more 
strongly evokes maladjustive behavior. 

The findings with respect to self-appraisal 
and biographical data replicate the results of 
previous studies (Sharp et al., 1954; Stunkel 
et al., 1952). The scoring key for the self- 
appraisal blank previously developed held up 
in the present study. As in the previous 
studies, a biographical information blank 
again showed some promise as a predictor of 
adjustment. No cross validation of the find- 
ings was performed in the study reported here. 

This study has been concerned with one 
type of isolated environment, namely, the 
isolation of groups of men in geographically 
remote areas. The extent to which the find- 
ings obtained under such conditions are gen- 
eralizable to other types of isolation is of 
theoretical and practical interest. Other types 
of isolation arise, for example, as a result of 
cultural differences, communication barriers, 
personal characteristics objectionable to a 
group, and prolonged confinement as in a sub- 
marine or space ship. 

It is further suggested that study of the 
behavior of individuals in an isolated environ- 
ment for the purpose of minimizing what is 
defined as poor adjustment in that situation 
can involve three approaches: (a) Selection:
as indicated in this study. (b) Training:
observations made in the course of this study
indicate that appropriate prior indoctrination
might facilitate adjustment. This could
consist of “familiarization training” concerning
the characteristics of the new environment
and personal counseling in the use of leisure
time. (c) Group Structure and Management:
the kind of leadership and interpersonal
structure which can mitigate any
undesirable effects of isolated living needs to be
studied further (Air Site Project Staff & 
Miller, 1952a). Continued studies of the 
effects of isolation and sensory deprivation 
should uncover variables which need to be 
manipulated in designing conditions of isola- 
tion favorable to adjustment. 


Received October 21, 1958. 


References 


Air Site Project Staff, & Miller, D. C. Human
relations at A. C. & W. sites. I. Summary of
findings: An interim report of the first year’s work.
Hum. Resources Res. Inst. Rep., 1952, No. HR-8. (a)

Air Site Project Staff, & Miller, D. C. Human
relations at A. C. & W. sites. II. Personnel problems.
Hum. Resources Res. Inst. Rep., 1952, No. HR-9. (b)

Air Site Project Staff, & Miller, D. C. Human
relations at A. C. & W. sites. III. Needs of site
personnel. Hum. Resources Res. Inst. Rep., 1952,
No. HR-10. (c)

Altus, W. D. Adjustment and food aversions among
Army illiterates. J. consult. Psychol., 1949, 13,
429-432.

American Institute for Research. Incomplete
Sentences Test for Pilots: Form D. Pittsburgh:
Author, 1953.

Eilbert, L. R., Glaser, R., & Hanes, R. M. Research
on the feasibility of selection of personnel for duty
at isolated stations. USAF Personnel Train. Res.
Cent. tech. Rep., 1957, No. 57-4.

Holsopple, J. Q., & Miale, Florence R. Sentence
completion: A projective method for the study of
personality. Springfield, Ill.: Charles C Thomas,
1952.

Sharp, L. H., & Harper, Bertha. Selection of
Quartermaster personnel for Arctic assignment. USA
TAGO Personnel Res. Br. Rep., 1953, No. [illegible].

Sharp, L. H., Goldstein, L. G., & Bolanvich, D. J.
Further study on selection of [illegible] personnel
for Arctic assignment. USA TAGO Personnel Res.
Br. Rep., 1954, No. 1089.

Stunkel, Eva R., Tye, V. M., & Yaukey, D. W.
Validation of experimental selection instruments
for Arctic service. USA TAGO Personnel Res.
Sect. Rep., 1952, No. 945.

Taylor, Janet A. A personality scale of manifest
anxiety. J. abnorm. soc. Psychol., 1953, 48,
285-290.

U. S. Army, The Adjutant General’s Office.
Construction of a self-description blank for Arctic
assignment. USA TAGO Personnel Res. Sect. Rep.,
1949, No. 835.


Journal of Applied Psychology 
Vol. 43, No. 4, 1959


STUDIES OF TRANSPARENCY IN FORCED-CHOICE 
SCALES: 


I. EVIDENCE OF TRANSPARENCY 


HOWARD MAHER 


University of Pennsylvania 


Mais in 1951 reported that a forced-choice
self confidence key developed from Jurgensen’s
Classification Inventory was found to be
fakable. In 1953, Longstaff and Jurgensen
substantiated the finding. In a study designed
to be more like the situation in which
the test was intended to be used, e.g., the
industrial situation, they found important
response changes. More recently, Borislow
(1958) has demonstrated fakability in the
Edwards PPS.

In 1956, Schutter and Maher reported the
development of a forced-choice study activity
questionnaire (SAQ). Maher¹ has found that
this test holds its validity for a university
situation different from the one in which the
test was constructed. In the original article,
Schutter and Maher cited Scates’ (1949)
review of the Wrenn Study Habits Inventory in
which he noted that the inventory contained 
a number of easy “outs” for the student. The 
authors went on to propose the forced-choice 
test as a means of eliminating such “outs.” 

The present study was designed to test for
transparency in SAQ. It has the advantage
of not having to generalize from students’
simulations of perhaps unfamiliar situations
as in the Longstaff and Jurgensen study.
Rather, the student, here, is asked to “beat
the test” in a situation he has experienced,
i.e., the academic one. The study also was
undertaken, to be honest, because the author
doubted Longstaff and Jurgensen’s (1953)
statement that there “. . . is no reason to
believe that different results [fakability]
would be obtained from any other forced-choice
personality test” (p. 89).

Finally, there is an attempt to see whether
instructions, specifically a study skills lecture,
increase transparency. If so, SAQ could not
be considered of value if administered
following “How-to-Study” orientation
programs.

¹ Article in preparation.
Procedure 


The 30-block forced-choice form (SAQ), fully de- 
scribed by Schutter and Maher (1956), was ad- 
ministered to 106 students in each of two introduc- 
tory psychology lectures. The instructions did not 
especially emphasize the honesty aspect. Instead, 
the lecturer mentioned only that this was a research 
project and should be taken seriously, specifically: 

“We are conducting a research project and would 
like you to participate in the experiment. If you do
not wish to take part in the study, which requires 
the serious and accurate completion of a short ques- 
tionnaire, you may leave now.” 

For those participating, a laboratory credit was 
promised. As the questionnaire was passed out, the 
need for seriousness of response was again stressed. 
For recording answers, the students used IBM elec- 
trographic pencils and answer sheets. 

Three days following this, i.e., at the next class
meeting, one of the lecture sections was given a 
study skills lecture, a standard procedure in the 
course. The study was timed to straddle this regu- 
larly scheduled lecture. In the lecture, the material 
to be covered is also prescribed; in fact, it is “tied 
down” by a mimeographed handout provided for the 
students. Thus no attempt was made to avoid or to 
cover the items in SAQ. The handout and lecture 
resemble Hilgard’s (1957) “Management of Learn- 
ing” chapter (Ch. 11). The students were told they 
would be held accountable for this material at the 
next meeting of the class. This group is hereafter 
designated as BI—Beat with Instructions. 

The procedure was also arranged in point of time 
so that the other lecture section would be attending 
a scheduled chapel meeting at the time (also three 
days later) when they would normally have a lec- 
ture. Thus, for psychology class, at least, nothing 
intervenes to make for disturbance of the remainder 
of the design. This group becomes BNI—Beat, No 
Instructions; i.e., they did not have the study skills
lecture, prior to the next step. 

One week after the original administration (“hon- 
esty” condition), SAQ was readministered. This 
time, the students were asked to pretend that they 
were applying for admission to the university and 
that they would only be admitted if they obtained a 
high score. No attempt was made to define a “high 
score,” the procedure being equivalent to Longstaff 
and Jurgensen’s “fake over-all good” score.
Furthermore, the investigator told the students that
this was a forced-choice form deliberately designed to
eliminate “beating.” Finally, the investigator
attempted to motivate the students by saying that he
did not think them capable of “spotting” the
correct answers.

The papers were scored (separately
for the BI and BNI groups and for the two
administrations) on the IBM test scoring machine.


Results 


Is SAQ transparent? The question was 
handled by application of the signed-ranks 
test on individuals’ difference scores from the 
“honest” to the “beat” condition. In so do- 
ing, the BI and BNI conditions were com- 
bined, the test thus being made for trans- 
parency regardless of the effect of having or 
not having instructions. Even without sta- 
tistical test, the answer seems apparent. For 
212 difference scores, only 24 are negative, 
i.e., 24 Ss have scores lower on the “beat” 
attempt than on the “honest” condition. 
Three Ss had no change from the one condi- 
tion to the other. The other 185 cases all 
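had score increases. For the total group the
mean increase in scores is 14, equivalent to
one sigma under “honest” conditions. [A
sentence reporting the largest gain is illegible.]
The signed-ranks test is significant
([illegible]-tailed test) beyond the [illegible]
level, rejecting the hypothesis of no
transparency in SAQ.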
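
The counts given above (185 increases, 24 decreases, 3 ties) are enough for a quick check with a sign test, which is a cruder relative of the signed-ranks test the author actually used: it ignores the size of each change and looks only at its direction. The sketch below uses the normal approximation and drops the ties; it is offered as a plausibility check, not as a reproduction of the reported analysis.

```python
from math import erfc, sqrt

def sign_test_z(n_increase, n_decrease):
    """Normal-approximation sign test; ties are excluded beforehand.
    Returns the z value and the one-tailed probability of a result
    at least this extreme if increases and decreases were equally likely."""
    n = n_increase + n_decrease
    z = (n_increase - n / 2) / sqrt(n / 4)
    p_one_tailed = 0.5 * erfc(z / sqrt(2))
    return z, p_one_tailed

z, p = sign_test_z(185, 24)  # counts taken from the text
print(round(z, 2), p)
```

With a z value above 11, the direction of change alone would be decisive, which agrees with the remark that the answer seems apparent even without a statistical test.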

On the question of the effect of instructions
upon transparency, it is first necessary
to demonstrate that the BI and BNI
groups are comparable under the “honest”
condition, i.e., that the two groups start from
comparable scores. Table 1 shows that this
is so. For the values in Table 1, t = 1.58,
not significant at the .05 level. With a
comparable starting point established, it is
possible to test the gains for the BI vs. the
BNI group. The BI group mean gain is 12.9
points (SD = 13.34). The group with no
instruction (BNI), on the other hand, gained
15.4 points (SD = 12.95). The difference is
in the direction opposite from that
hypothesized, i.e., it had been expected that
if there were any difference in gain scores it
would favor the group with instructions. The
Mann-Whitney U (two-tailed) provides p =
.09. Thus one must conclude that instructions
have no additional effect upon transparency.

Table 1
Means and Standard Deviations for the BNI and BI
Groups under the “Honesty” Condition
[table values not recoverable]
What is the result of this [illegible]?
The product-moment correlation between
“honest” and “beat” scores for the BNI
group is r = .09; for the BI group r =
[illegible]. Of greatest importance, the
validity of the test under the “beat” condition
(N = 95 for whom grade point averages are
available) is r = .10 for BNI; r = .07 for BI
(N = [illegible]). Under the honest conditions
validities are [illegible] for BNI and .56 for BI.


Discussion 


There is shown here a condition similar
to those found by others. Thus, although
forced-choice scales are specifically designed
to greatly retard faking, there is demonstrated
that, for still another forced-choice
form, statistically significant increases occur
when these devices are subjected to pressure.
In this instance, the conditions are familiar
to the Ss used, whereas, in the Longstaff
and Jurgensen study, students are asked
to simulate a perhaps unfamiliar situation.
Most important, in this study, it is possible
to study directly the effects of such “beating”
attempts upon the validity of the test.

The average gain found in this study is
14 points on a possible 79-point range. Under
the honest condition SAQ has usually
yielded mean scores at approximately the
midpoint of this range. There is demonstrated,
here, not only a statistically significant
gain from “honest” to “beat” conditions
but one of practical significance as well. Thus
the mean shifts approximately 35% of the
distance from the midpoint to the top of the
possible score range.

One could argue that even greater distortion
would be obtained with the usual study
habits measurement device, i.e., perhaps these
instruments are virtually perfectly transparent
and scores could be moved to the very
top of the range. However, one cannot ignore
the fact that under “beat” conditions SAQ
loses all validity. The mere fact that the
forced-choice design may have had some
retarding effect on the mean shift, then, does
not preserve its validity. With this finding,
also, it becomes clear that we could not
merely construct norms for “no pressure”
conditions (e.g., counseling and guidance)
and others for administrative pressure
conditions (as academic selection or sectioning).
The present form of SAQ, thus, will be of
value only in those cases in which such
pressures do not exist and where honesty of
response may be safely assumed.
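
The figure of approximately 35% quoted in this discussion can be checked with simple arithmetic. The sketch assumes that the "honest" mean sits exactly at the midpoint of the 79-point range; the text says only that it is approximately there.

```python
def shift_toward_top(mean_gain, score_range):
    """Mean gain as a fraction of the distance from the midpoint of the
    score range to its top (assumes the starting mean is at the midpoint)."""
    midpoint_to_top = score_range / 2
    return mean_gain / midpoint_to_top

fraction = shift_toward_top(mean_gain=14, score_range=79)
print(f"{fraction:.1%}")
```

Fourteen points against a remaining distance of 39.5 points gives a little over 35%, matching the author's statement.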

Another pertinent finding is that transpar- 
ency is not increased via a rather typical 
Study skills lecture. As a matter of fact, 
there is a slight, but not statistically signifi- 
cant, tendency for a group not having such 
information to “beat” the test more than 
those having the lecture. The finding of no 
difference attributable to information is not 
entirely surprising. For one thing, not all of 
the weighted items in SAQ are covered in 
even more extensive study orientation programs
or texts. More to the point, not all of
the “advice” given in such sources receives
weight in SAQ’s scoring system. The author
believes, furthermore, that the lecture given
was typical enough so that no general
information on “how to study” would affect scores
on the test. Thus we appear to have the
additional finding that (again), assuming
conditions of honesty, SAQ could be used
following such orientation and would not as a
direct consequence be contaminated.
The major point remains, however, that one
must agree with Longstaff and Jurgensen
that any forced-choice scale must be
considered potentially beatable. We are,
therefore, faced, in the future, with investigating
the sources of transparency in such
scales. Such is beyond the intent of the
present study, it being the purpose here to
investigate the presence of, rather than the
sources of, transparency.


However, several transparency hypotheses
are here set forth. It is the intention of the
author to systematically examine these in fu- 
ture investigations. 

1. Schutter and Maher (1956), in the 
original article, noted a strange operation of 
the forced-choice form used with SAQ, i.e.,
the Richardson system. “. . . it is clear that 
they [students] do not resist unfavorable al- 
ternatives to the extent that raters do. These 
latter deny favorable alternatives or admit 
unfavorable ones less frequently. In this case 
32% of the responses are of this nature” (p. 
257). 

Certainly this finding would make a form 
extremely “beatable.” All the student has to 
do, after denying favorable statements or ad- 
mitting to unfavorable statements in the 
“honest” condition, is shift away from such 
responses in the “beat” condition. 

As a rough check on this, the first 14 items 
in SAQ were item analyzed using the IBM 
Graphic Item Counter. The analysis was 
performed for the response “most descrip- 
tive” only. For these items there was a total 
group gain of 1250 points on the most de- 
scriptive side from the “honest” to the “beat” 
condition. Of this gain, 33% is accounted 
for by shifts from a negative statement listed 
as “most descriptive.” In other words, for 
these items 33% of the transparency is ac- 
counted for by use of the Richardson system 
in a situation in which it rather obviously 
does not operate as expected. The finding 
should be fully investigated for all items on 
both the most descriptive and least descrip- 
tive responses. For the present, however, it 
would appear that the Richardson format, in 
the study skills area of measurement, con- 
tributes to transparency. It is also of inter- 
est to recall that Highland and Berkshire 
(1951, p. 35) found a Richardson rating form 
and a similar one less bias-resistant than other 
forced-choice forms in which favorable and 
unfavorable statements are not mixed in the 
item.

2. Highland and Berkshire (1951), in their 
excellent investigation of some of the basics 
of forced-choice methodology, have speculated 
on the possibility that, in the pairing of items 
(equal on some appearance index, e.g., preference
index; unequal on discrimination index),
large discrimination index differences,
while making for increased validity, might
also make for increased transparency. 

3. To the best of the author's knowledge, 
appearance pairings have traditionally been 
pairings on mean appearance scores. The 
standard deviations of the appearance indices 
apparently have been ignored.

4. Appearance indices are traditionally es- 
tablished out of the context in which they 
are eventually to be used. Often a question- 
naire stage is first resorted to, each item in 
the questionnaire being rated, one by one, on 
some appearance scale. In the forced-choice 
stage, on the other hand, they appear more 
directly in comparison with one another, in 
a particular context or “field.” It might be 
possible to pretest such “field” conditions 
through such treatments as paired-comparisons
ratings or some variation of these.

Finally, although the author must now 
agree with the conclusion of Longstaff and 
Jurgensen that all forced-choice devices could 
be transparent, he would prefer to emphasize 
the fact that they are not necessarily so. But 
only systematic investigation will lead to the 
sources of transparency. These should yield
generalizable information on the construction 
of valid and nontransparent scales, 


Summary 


There is demonstrated here for a study activity
questionnaire (SAQ) a condition found
by others, i.e., forced-choice forms are, to a
serious extent, transparent. In this study
there has been found the statistically significant
upward shift in mean score from “honest”
to “beat” conditions seen in other
investigations.

For the first time, to the best of the author’s
knowledge, the study shows directly
the serious effect of such a shift, i.e., validity
disappears under “beat” conditions. SAQ
could still be of value in those instances in
which pressures to increase scores do not
exist. In this event, however, there is no
particular advantage to SAQ’s forced-choice
format and its use would have to be determined
solely in terms of its validity coefficients
in “honesty” situations.

The author would not conclude, as have
others (Longstaff & Jurgensen, 1953), that
forced-choice tests are transparent and that
other techniques must be devised. This is
equivalent to tossing out the baby with the
bath. Instead, he feels that we must search
for the sources of transparency in these
otherwise promising instruments. Several sources
are hypothesized with the hope that, when
systematically investigated, they will enable us
to construct truly nontransparent questionnaires.
One such source is partially investigated
here, indicating that the Richardson
and similar formats may increase the
transparency in the framework of measurement for
which SAQ is intended.

Finally, the study demonstrates that a
“how to study” lecture does not further
increase the transparency of the study
questionnaire.


Received December 22, 1958, 


References 


Borislow, B. The Edwards Personal Preference
Schedule (EPPS) and fakability. J. appl.
Psychol., 1958, 42, 22-27.

Highland, R. W., & Berkshire, J. R. A methodological
study of forced-choice performance rating.
USAF Hum. Resour. Res. Cent. res. Bull., 1951,
No. 51-9.

Hilgard, E. R. Introduction to psychology. (2nd
ed.) New York: Harcourt Brace, 1957.

Longstaff, H. P., & Jurgensen, C. E. Fakability of
the Jurgensen classification inventory. J. appl.
Psychol., 1953, 37, 86-89.

Mais, R. D. Fakability of the classification
inventory scored for self confidence. J. appl. Psychol.,
1951, 35, 172-174.

Scates, D. E. Review of Study Habits Inventory.
In O. K. Buros (Ed.), The third mental measurements
yearbook. New Brunswick, N. J.: Rutgers
Univer. Press, 1949. Pp. 566-568.

Schutter, Genevieve, & Maher, H. Predicting
grade-point average with a forced-choice study activity
questionnaire. J. appl. Psychol., 1956, 40,
253-257.


Journal of Applied Psych v 
Vol. 43, No. 4, 1959


RELATIONSHIP OF THE GUTTMAN COMPONENTS 
OF ATTITUDE INTENSITY AND PERSONAL 
INVOLVEMENT¹


LANE H. RILAND 


Eastman Kodak Company, Rochester, New York 


There has been little or no published research
on Guttman’s fourth principal component
of scalable attitudes, involution, other
than his article (1954) describing the theory
in postulating this component. In an earlier
article, Katz (1944) stated that in the attempt
to more closely define the attitude and
predict responses, questions of “personal
involvement” were used but were not successful
in predicting responses to related content
items. Guttman, in his study, measured personal
involvement in listening to broadcasts
of the Voice of Israel by such questions as
“How often do you listen to the VOI?” and
“Do you make it a point to open the radio at
certain hours for certain programs?” These
data were utilized to determine the respondent’s
“involution” regarding this area of
attitude.

His hypothesis in relating these data to
content responses was that the scores on personal
involvement in listening to the broadcasts,
when plotted against the content scale
scores on favorableness (“Do you think the
broadcasts are good in general?”), would
yield an M-shaped curve with the middle low
point or zero-point of involution or involvement
at the same point along the content
continuum as that of the zero-point of the
intensity curve for the same data. The intensity
component was measured by questions
regarding how strongly the respondents felt
about their expressed attitudes of favorableness
or unfavorableness.

As noted by Guttman, prior to the time
when the data on involution were processed,
his associates predicted that the resultant
curve of involution would assume the same
characteristic shape as that of the intensity
component, i.e., U or J shaped, with those
respondents most and least favorable being
the most involved personally in their attitudes.
The results of the study, however,
were in agreement with Guttman’s theoretical
prediction in that he found the M-shaped
relationship, indicating that the respondents
who were both most and least favorable in
their attitudes toward the broadcasts were
also the least personally involved in their
listening habits regarding these broadcasts.
He stated that this showed a “prejudiced” or
“unreasoned” attitude.

¹ This paper is based on a portion of a doctoral
dissertation, Pennsylvania State University, 1958,
under the direction of Lester P. Guest. The financial
assistance of the Hamilton Watch Co. in conducting
this study is gratefully acknowledged. The author
is indebted to C. C. Upshall and S. M. Newhall
for their helpful criticism of the initial draft.

Evidence pointing to the possible confusion 
of these two components of intensity and 
involution may be found in the discussion of 
the problem of error in the relationship of 
content and intensity by Guttman and Such- 
man in an earlier article (1947). They de- 
scribe situations where respondents having 
answered “Undecided” to a content item 
would answer “Very strongly” to the stand- 
ard intensity item—‘How strongly do you 
feel about your answer to that (content) 
question?” —which followed the content item. 
When asked why they answered in this way, 
many respondents said they meant the prob- 
lem in question was “Very important.” As- 
suming that, as in all questionnaire research, 
differing verbal habits may cause respondents 
to think quite differently about what a certain 
term means, there still appears to be some 
indication that the usual intensity question— 
“How strongly do you feel about your an- 
swer?”—may often be confused with the 
typical item used to measure Guttman’s com- 
ponent of involution, e.g., “How important is 
this to you personally?” 


Problem 


The problem in this investigation was to 
determine whether the involution-type metric 




would yield the M-shaped curve hypothesized 
by Guttman, and also to study the relation- 
ship of the intensity function and that of 
involution. In this study it was hypothesized 
that the functions of these two components 
were more closely related from the standpoint 
of respondent perception than indicated by 
Guttman in his primarily mathematical for- 
mulation (1954). The hypothesis was that 
when involution scores were plotted against 
content scores, the resultant regression would 
not be M shaped, but would more closely ap- 
proach the U- or J-shaped curve characteristic 
of intensity scores when plotted against con- 
tent scores. Those respondents most involved 
in their attitudes would be the most and least 
favorable on content. A significant, positive 
statistical relationship between the intensity 
and involution scores was also hypothesized. 


Procedure 


Lane H. Riland


The intensity and involution scores were derived from the responses of a
random sample of 388 residents of a central Pennsylvania community to a
questionnaire regarding their attitudes toward a local company. Using
Guttman scale analysis, a six-item “general attitude” scale was developed
with a reproducibility of .88. Although the number of scale items was
somewhat below the number recommended by Guttman for unidimensional
scales, it was felt that their reliability had been adequately
established according to Guttman theory in that they scaled for the
entire survey population.
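
The reproducibility figure quoted above can be illustrated with a minimal sketch. It assumes the usual definition of Guttman's coefficient of reproducibility (one minus the proportion of scale errors among all item responses); the article does not report an error count, so the count used below is hypothetical and chosen only to show the arithmetic.

```python
def reproducibility(n_errors: int, n_respondents: int, n_items: int) -> float:
    """Coefficient of reproducibility: 1 - errors / (respondents * items)."""
    total_responses = n_respondents * n_items
    if total_responses <= 0:
        raise ValueError("need at least one respondent and one item")
    return 1.0 - n_errors / total_responses


# 388 respondents and 6 items come from the text; the error count of 279
# is a hypothetical value, not a figure reported in the article.
rep = reproducibility(n_errors=279, n_respondents=388, n_items=6)
print(round(rep, 2))
```

How scale errors are counted differs between scaling procedures, so the function takes the error count as an input rather than deriving it.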

Intensity was measured by three two-part intensity items attached to
content items which met the Guttman requirements for scalability over
the entire sample. The intensity score consisted of the sum of the scale
values of the responses to these three items. In this two-part
technique, a separate item followed the content item and asked the
respondent “how strongly” he felt about his answer to the preceding
content item. This technique was felt to be superior to the “foldover”
technique (Suchman, 1950), sometimes used in intensity measurement, as
it is a more independent measure.

Involution was measured by three items also attached to scalable content
items. These items were designed to obtain an index of the respondents’
personal involvement in their thinking about [illegible] concerning the
company: whether (a) they were

Table 1 


Relationship of General Content and Two-part Intensity Scores 


General Content Score 


N = 388 


Intensity - ~n 
Score 7 0 1 2 3 4 — PE j Cum. “o 
- E S a o Sa 
it : : i = 3 4 12 100 
10 1 ‘i 1 1 1 16 32 9 
9 ae 3 a 7 16 1 46 89 
8 1 aj F258 3a ap 104 yi 
7 1 z S & # 26 17 73 50 
5 7 Ja 3 
6 2 1 i E: 14 4 48 31 
5 > “ 7 15 2 38 19 
4 — c: 4 2 3 — 11 9 
3 ni 7 2 a l 4 1 11 h 
2 $ F ! 4 1 = i 3 3 
1 > k 7 f 1 1 3 : 
o 1 1 = . = = - : 0.5 
12 a7 g j 
1 “ 5 E e 130 77 388 
Cu 5 0 9 W y 80100 
Midpoint of Content
Percentile 1.5 6.5 14.5 24.5 38.5 63.5 90.0
Median of Intensity f 
Percentile 59 48 34 m m 


^ Cell containing median for each content score, 


51 52 67 


Attitude Intensity and Personal Involvement 


Table 2 
Relationship of General Content and Involution Scores 
N = 388 


Involution 


| 7 soluti > i 2 3 4 5 6 y Cum. % 
l 15 3 4 3 3 R 17 - T 2 

14 = 1 3 3 6 oe s x 70 
13 1 54 i 8 19 a = 
12 1 1 6 3 oe ; a i 
n » 2 3S 8 Se A s 36 
10 2 š 5 lœ 8 u : - 2 
9 1 2 1 1 3 a ; 30 is 
8 = 3 3 7 7 T i z i0 
7 = = 2 4 l ; ) i? 7 
6 sx 3 p3 3 3 # H % 5 
5 = : 3 ' = 3 7 
4 1 — : = - : : 
3 - 0 0.3 
- - = 0 0.3 
= - = = = — 0 0.3 
i - - 1 0.3 

J 12 27 3 å 2 6 130 17 a 

— i m 19 30 47 80 100 

Midpoint of Content
Percentile 1.5 6.5 14.5 24.5 38.5 63.5 90.0

Median of Involution 

Percentile 42 si i 7 = : 

‘se i 


Cell containing median for each content score.


“interested in” or “kept abreast” of developments at [illegible], (b)
they were “personally concerned” with developments there, and (c) the
matter of the company “keeping up with other companies in [illegible]”
was important to them personally. The scale scores on these three items
were totaled to obtain the involution score.
The median intensity score percentiles were then plotted against the
midpoints of the content score percentiles on the general attitude
toward the company, as described by Suchman (1950), resulting in the
intensity curve. The involution scores were plotted against the content
scores in the same fashion to obtain the involution curve.

The chi square test of significance was utilized and the coefficient of
contingency was calculated to determine the statistical relationship of
the intensity and involution scores.
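
The horizontal coordinates used for these curves can be illustrated with a short sketch. It assumes that the midpoint of a content category is the average of the cumulative percentages at its lower and upper boundaries. The cumulative percentages below are read from the tables, with the first two values (3 and 10) inferred from the printed midpoints because the scan is illegible at that point.

```python
def percentile_midpoints(cumulative_pct):
    """Midpoint of each category's percentile band on a 0-100 scale."""
    midpoints = []
    lower = 0.0
    for upper in cumulative_pct:
        midpoints.append((lower + upper) / 2)
        lower = upper
    return midpoints


# Cumulative percentages for content scores 0 through 6.
# Assumption: the first two entries are inferred, not legible in the source.
content_cum_pct = [3, 10, 19, 30, 47, 80, 100]
print(percentile_midpoints(content_cum_pct))
# [1.5, 6.5, 14.5, 24.5, 38.5, 63.5, 90.0]
```

The result agrees with the "Midpoint of Content Percentile" row printed under both tables.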


Results 


Table 1 shows the relationship of the general attitude content scores
and the two-part intensity scores from which the intensity curve was
plotted. This curve appears in Fig. 1.

The relationship of the general attitude 
content scores and the involution scores is 
shown in Table 2. 

Figure 1 shows the intensity curve in this 
study approximates the U or J shape typically 
found in the Guttman analysis of intensity. 
The zero-point, or point of indifference, falls 
at approximately the 25th content percentile. 
This, according to Guttman theory, is the 
point which separates the favorable and un- 
favorable respondents on the content con- 
tinuum (Guttman & Suchman, 1947). 

Figure 2 shows the involution curve found in this study as well as the
M-shaped curve of involution postulated by Guttman. It can be seen that
the involution curve in this study does not assume the shape postulated
by Guttman, and that in this study the respondents who were most and
least favorable were on the average more involved, rather than least
involved as predicted by his theoretical formulation.


Fig. 1. Intensity curve showing the relation of intensity scores to
general attitude content scores. Axes: intensity percentile against
general content percentile (unfav. to fav.).

The intensity and involution curves found 
were superimposed on the same matrix in 
Fig. 3. This comparison shows that these 
two components had quite similar regressions 
when plotted against the general attitude content percentiles.

In his study, Guttman presented no statistical correlation of the
intensity and involution scores found, but preferred only to compare the
coincidence of zero-points and the locations of the bending points of
the curves. Table 3 shows that in this study the coefficient of
contingency of .28, although not extremely high, was very significant.
The maximum C possible for this size table is .82 (Siegel, 1956).


Fig. 2. Involution curve showing the relation of involution scores to
general attitude content scores as well as the curve postulated by
Guttman. Axes: involution percentile against general content percentile
(unfav. to fav.).


Fig. 3. Comparison of the curves of intensity and involution on the same
general attitude content matrix. Axes: intensity-involution percentile
against general content percentile (unfav. to fav.).


Discussion


The hypothesis that the curves of intensity and involution would closely
approximate each other appeared to be generally substantiated. This
finding is not in agreement with Guttman’s original findings in
postulating

Table 3


Statistical Relationship of Intensity and
Involution Scores


N = 388 a 
Involution Score _ _— 
Intensity 12-15 
Score 0-6 7-11 
27 
9-12 5 62 A 
5-8 18 78 5 
5 4 
0-4 5 1 — 


Chi square = 34.53, df = 4; significant at less than the .001 level.
Coefficient of contingency = .28.
Maximum C possible for 3 × 3 table = .82.


these two supposedly different components of scalable attitudes. In the
present study the respondents who were most involved were those who were
most and least favorable in their attitudes toward the company.
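
The contingency figures cited with Table 3 can be checked with a short sketch. It assumes the standard definitions, C = sqrt(chi2 / (chi2 + N)) and, for a square k × k table, a maximum of sqrt((k - 1) / k). The function names are illustrative and do not come from the article.

```python
import math


def contingency_coefficient(chi_square: float, n: int) -> float:
    """Pearson's coefficient of contingency, C = sqrt(chi2 / (chi2 + N))."""
    return math.sqrt(chi_square / (chi_square + n))


def max_contingency(k: int) -> float:
    """Upper limit of C for a square k x k table, sqrt((k - 1) / k)."""
    return math.sqrt((k - 1) / k)


# Chi square, N, and table size are the values reported with Table 3.
c = contingency_coefficient(chi_square=34.53, n=388)
c_max = max_contingency(3)
print(round(c, 3), round(c_max, 2))
```

Under these assumptions C comes out at about .286, which matches the printed .28 if that value was truncated rather than rounded, and the maximum for a 3 × 3 table comes out at .82 as stated.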
In his original article, Guttman (1954) stated that those who were most
and least favorable, and had little or no personal involvement in the
VOI broadcasts toward which the attitude was expressed, showed a
“prejudiced” or “unreasoned” attitude. He stated that this type of
“prejudice” related to the lack of personal involvement in radio may be
called that due to the “absence of reasoning from lack of personal
contact with radio.” He also stated that there may be another type of
“prejudice” which fits his idea of involution for other types of data
which might be called “cessation of reasoning even though there is close
contact.” This thinking was based on the results of a previous study,
also reported in his original article (1954), which resulted in the
postulation of his third component of “closure,” measuring generally
whether or not the respondent had “definitely made up his mind” on the
attitude issue in question. The aspect of closure was not investigated
in this present study. In the original study, Guttman defined respondent
involution as “turning the attitude over and over within himself,” but
measured it as personal involvement, as was done in this present study.
This “prejudice principle” was not evident in this study. The more
favorable respondents did appear to be slightly more involved than those
least favorable, but the findings do not indicate that even these least
favorable respondents were the least involved in their attitudes. The
zero-points of both the intensity and involution curves do generally
coincide as postulated by Guttman, however.

Perfect statistical correlation between intensity and involution was not
expected; as Guttman’s associates in the Israel study (1954) predicted,
and as Guttman and Suchman (1947) found in their analysis of intensity
error, there appears to be some question as to the distinctness of these
two components, at least as measured by items worded in the suggested
fashion. Perhaps researchers using these two components should undertake
a more intensive analysis of the wording of these items, particularly
with regard to personal involvement, in order to more clearly define the
interrelationships between them, if they are in fact measuring distinct
components of scalable attitudes as postulated by Guttman.


Summary and Conclusions 


A random sample of 388 residents of a
central Pennsylvania community was surveyed
regarding their attitudes toward a local
company. Guttman scaling techniques were 
applied, and a six-item scale of “general atti- 
tude” resulted, with a reproducibility of .88. 
These six items scaled for the entire survey 
sample. The respondents’ attitude intensity 
and personal involvement (involution) in 
their attitudes toward the company were 
measured and analyzed by the techniques sug- 
gested by Guttman to test his theory that 
the intensity and involution components would 
show U- and M-shaped regressions, respec- 
tively, when plotted against general attitude 
content (favorableness). 

It was hypothesized that the intensity and 
involution regressions would show similar 
curves, and that those respondents who were 
the most and least favorable toward the com- 
pany would also be the most involved in their 
attitudes toward the company, and not the 
least involved as predicted by Guttman. It 
was also hypothesized that there would be a 
significant, positive statistical relationship be- 
tween the scores on intensity and involution. 

In light of the results of this study, the 
following conclusions appear to be justified: 

1. The regression of involution scores against general attitude content
scores resulted in a curve quite similar in shape to that of the
intensity scores when plotted against content scores, indicating that
those respondents most involved in their attitudes toward the company
were on the average the most and least favorable in their attitudes.

2. There was a very significant, although 
not extremely high, positive relationship be- 
tween the intensity of the attitudes expressed 
and personal involvement in the attitudes 


toward the company. 



3. More research is needed on these two 
components to more clearly define the inter- 
relationships between them. 


Received November 17, 1958, 


References 


Guttman, L. The principal components of scalable attitudes. In P. F.
Lazarsfeld (Ed.), Mathematical thinking in the social sciences. Glencoe,
Ill.: Free Press, 1954. Pp. 216-257.

Guttman, L., & Suchman, E. A. Intensity and a zero-point for attitude
analysis. Amer. sociol. Rev., 1947, 12, 57-67.

Katz, D. The measurement of intensity. In H. Cantril (Ed.), Gauging
public opinion. Princeton, N. J.: Princeton Univer. Press, 1944. Pp.
51-65.

Siegel, S. Nonparametric statistics. New York: McGraw-Hill, 1956.

Suchman, E. A. The intensity component in attitude and opinion research.
In S. Stouffer et al., Measurement and prediction: Studies in social
psychology in World War II. Princeton, N. J.: Princeton Univer. Press,
1950. Pp. 213-276.




Journal of Applied Psychology


Vol. 43, No. 5


OCTOBER, 1959


PREDICTING TRADEMARK EFFECTIVENESS


HARRY A. BURDICK, EDWARD J. GREEN, AND JOSEPH W. LOVELACE 1
Dartmouth College


Under the sponsorship of Raymond Loewy Associates, we undertook the
following study. The task which confronted us may be related in terms of
a single question, namely: “How effective is a given trademark in
comparison to six other trademarks, one of each of the product’s leading
competitors?” The question is a simple and reasonable one, undoubtedly
of major significance to all advertisers. The answer is not so simple,
mostly because the word “effective” is so difficult to define. In the
following report, we will present the definition which we used and the
correlates we found to this definition of effectiveness.

We decided that for a trademark to be an effective trademark it must
have the following properties:

1. Salience: it must be readily seen and recognized.

2. Meaningfulness: it must convey, through associations, connotations
which are favorable and significant to the observer.

3. Memory-value: it must be readily remembered by the viewer.

With these criteria, we set out to investigate the relative
effectiveness of each of seven trademarks.


Method


Subjects

A total of 166 male Ss were tested. A proportion of the Ss (40)
participated in the salience and meaning phase of this experiment. The
other 126 participated in the meaning and memory-value phase of the
experiment. All Ss were paid for their participation. Those Ss who
performed in the salience aspect had vision corrected to 20/20.


Materials

Official trademarks of seven companies were obtained. These were
enlarged photostatically and the official colors applied by air brush.
Ektachrome transparencies were prepared of each of the emblems.

In the memory-value aspect of the study, all of the trademarks were
presented on one slide. Four different arrangements of the trademarks
were photographed for this, so that whatever effects position might have
would be reduced. Ektachrome transparencies of the composites were
produced.


1 J. W. Lovelace is with the Raymond Loewy Associates.


Salience and Meaning Procedure 

Ss appeared in a testing laboratory individually. They were met by E,
seated before a screen, and told that words would be flashed upon the
screen. S was to report everything he saw. Slides were shuffled before
each series to control the effect of position in the series. For each S,
the point at which correct identification occurred was recorded by E.

The slides were shown through a matched pair of 300-v. projecting
tachistoscopes which were placed 6½ ft. from the screen. The screen was
held at a constant illumination to reduce the reading of after-images.
Preliminary investigation showed that no one recognized the trademarks
at exposure intervals less than 1/50 sec. without repeated
presentations. Therefore, our series began here, with all slides being
presented at 1/50 sec., then all slides at 1/25 sec., all at 1/10, 1/5,
1/2, and 1 sec. After this part of the study was completed, S was given
a ten-item copy of the Semantic Differential described by Osgood, Suci,
and Tannenbaum (1957). He was shown each slide for as long as he wished
and evaluated the trademark in terms of 10 seven-point dimensions. This
is our operational definition of “meaning.”


Memory-Value and Meaning Procedure 

Groups met and were told they would be asked 
to view a number of slides. Half of the group was 
sent out of the room while the others remained. 
The ones remaining were provided with forms for 
the Semantic Differential and asked to turn the 
forms over and write on the back. The technique 
used to test memory-value was the Aussage tech- 
nique. 

One of the four transparencies containing a view 
of all of the trademarks was presented to the group 
for 15 sec. Then Ss were asked to write down all 
of the trademarks which they recalled. The pro- 
cedure was duplicated for the other half of the 
group, using a second slide with different positional 
arrangement. After responses were made to the composites,
each trademark was presented alone, and Ss
in the group were asked to evaluate it on each of
10 dimensions of the Semantic Differential.




Results 
Salience


Mean recognition thresholds were computed 
for each of the trademarks. Trademarks 
were then ranked, with the trademark having 
the lowest threshold receiving a rank of one. 
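
The salience ranking can be sketched as follows. This is only an illustration under stated assumptions: each S contributes, for each trademark, the exposure duration at which correct identification first occurred; the trademark names and threshold values are hypothetical; and ties between means are not handled.

```python
from statistics import mean

# Exposure series taken from the procedure, in seconds.
EXPOSURES = [1 / 50, 1 / 25, 1 / 10, 1 / 5, 1 / 2, 1]


def rank_by_mean_threshold(thresholds):
    """Rank trademarks by mean recognition threshold; rank 1 = lowest mean."""
    means = {name: mean(values) for name, values in thresholds.items()}
    ordered = sorted(means, key=means.get)
    return {name: rank for rank, name in enumerate(ordered, start=1)}


# Hypothetical thresholds (seconds) for three trademarks and two Ss.
example = {
    "A": [1 / 50, 1 / 25],
    "B": [1 / 10, 1 / 5],
    "C": [1 / 25, 1 / 10],
}
print(rank_by_mean_threshold(example))
# {'A': 1, 'C': 2, 'B': 3}
```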


Meaning 

Following a recommendation of Osgood et al. (1957), the characteristics
of evaluation, potency, and activity were taken to be most significant
in designating “meaning.” Good-bad, strong-weak, and active-passive were
taken to be the focal dimensions of evaluation, potency, and activity.
Scores on these dimensions were summed for each trademark, and a rank
ordering from most to least along the dimension was obtained. The rank
on the evaluation dimension was doubled, as Osgood et al. suggest, and
added to the rank of the other two dimensions. A final rank order of the
trademarks for the three dimensions combined was thus obtained.
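
The weighting just described can be sketched in a few lines. The sketch assumes that rank 1 goes to the highest summed score on a dimension and that the final order runs from the smallest weighted rank sum upward; the scores are hypothetical and ties are not handled.

```python
def rank_descending(scores):
    """Rank 1 for the highest summed score on a dimension."""
    ordered = sorted(scores, key=scores.get, reverse=True)
    return {name: rank for rank, name in enumerate(ordered, start=1)}


def meaning_rank(evaluation, potency, activity):
    """Final rank from 2 * evaluation rank + potency rank + activity rank."""
    ev = rank_descending(evaluation)
    po = rank_descending(potency)
    ac = rank_descending(activity)
    weighted = {name: 2 * ev[name] + po[name] + ac[name] for name in ev}
    ordered = sorted(weighted, key=weighted.get)
    return {name: rank for rank, name in enumerate(ordered, start=1)}


# Hypothetical summed scale scores for three trademarks.
evaluation = {"A": 30, "B": 20, "C": 10}
potency = {"A": 10, "B": 30, "C": 20}
activity = {"A": 20, "B": 10, "C": 30}
print(meaning_rank(evaluation, potency, activity))
# {'A': 1, 'B': 2, 'C': 3}
```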


Memory-Value 


A score of seven was assigned to a trade- 
mark every time it was the first of the group 
which was written down by S, a six for each 
time it was second, and so on. If it did not 
appear at all, it was given a score of zero. 
From the totals of the summe 


and so on. 
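
The recall scoring rule can be sketched as below. The point values (seven for first, six for second, down to zero when a trademark is not recalled) follow the text; the trademark names and recall lists are hypothetical, and the sketch stops at the score totals because the source text describing the subsequent step is incomplete.

```python
def memory_value_totals(recall_lists, trademarks):
    """Sum recall-order points per trademark: 7 for first, 6 for second, ..."""
    totals = {name: 0 for name in trademarks}
    for recalled in recall_lists:
        for position, name in enumerate(recalled):
            if name in totals:
                # A trademark that was never written down simply adds nothing.
                totals[name] += max(7 - position, 0)
    return totals


# Hypothetical recall protocols from two Ss.
protocols = [["A", "B"], ["B", "A", "C"]]
print(memory_value_totals(protocols, ["A", "B", "C"]))
# {'A': 13, 'B': 13, 'C': 5}
```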


Using a Spearman rho, rank order correlations were computed for each of
the aspects of effectiveness with one another. On the basis of face
validity, the memory-value aspect of our effectiveness definition was
felt to be most critical. Hence the correlations of salience and meaning
with memory-value were of special importance. Finally, a composite rank
order was composed by combining the ranks of the salience and meaning
dimensions, giving them equal weight, and this was correlated with the
memory-value order. The matrix of correlations is presented in Table 1.

As can be seen in Table 1, all of the aspects correlate positively with
one another. Of particular interest is the exceptionally high
correlation between memory-value and the composite ordering of the
salience and meaning aspects.
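
For readers who want to see the computation, a minimal sketch of the Spearman rank-order coefficient follows. It uses the textbook formula for untied ranks, rho = 1 - 6 * sum(d^2) / (n * (n^2 - 1)); the two rank orders shown are hypothetical and are not the article's data.

```python
def spearman_rho(rank_x, rank_y):
    """Spearman rho for two rank orders of equal length, assuming no ties."""
    if len(rank_x) != len(rank_y) or len(rank_x) < 2:
        raise ValueError("rank lists must have equal length of at least 2")
    n = len(rank_x)
    d_squared = sum((x - y) ** 2 for x, y in zip(rank_x, rank_y))
    return 1 - 6 * d_squared / (n * (n ** 2 - 1))


# Hypothetical rank orders for seven trademarks on two measures.
composite = [1, 2, 3, 4, 5, 6, 7]
memory_value = [1, 2, 3, 4, 5, 7, 6]
print(round(spearman_rho(composite, memory_value), 2))
# 0.96
```

With only seven items, a single swap of adjacent ranks already lowers rho from 1.00 to about .96, which gives a sense of scale for coefficients computed on so few objects.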


H. A. Burdick, E. J. Green, and J. W. Lovelace 


Table 1 


Intercorrelation of Effectiveness Measure 


V.T. S.D. Aussage 
—— — noe 
“9% Ao 

Visual Threshold = a 82* 
Semantic Differential na 2 
Aussage 94** 
V.T. and S.D. Combined pue. 

* Significant at .05 level. 

** Significant at .01 level, 

Discussion 


We feel that the results of this study indicate a great deal of power is
to be obtained from a straightforward approach to a particularly
practical but slippery problem. The approach has involved techniques
taken from two subdisciplines of psychology, namely, the sensory and
cognitive areas. Through an opportune choice of variables in these
areas, we have demonstrated the significance of these variables for the
evaluation of either an existing emblem or an original, new one. This
obviously does not mean that the most effective trademark, by this
definition, will necessarily give rise to an increase in sales or
connote integrity or honesty of the [illegible] symbolized. But it is
even more apparent that such desired ends are not to be attained by
making the trademark less visible or less favorable in its meaning. The
first step toward “effectiveness” in this latter sense is the making of
the trademark as effective as possible in salience, meaning, and
memory-value.


Summary 


We have investigated the effectiveness of seven competing trademarks.
Effectiveness is defined in terms of the salience, meaning, and
memory-value of the trademark. These characteristics were found to be
positively related. Further, taking memory-value to be our major
dependent variable, we found that, through a combination of our salience
and meaning measures, we were able to predict the memory-value with a
high degree of success.


Received December 29, 1958. 


Reference


Osgood, C. E., Suci, G. J., & Tannenbaum, P. H. The measurement of
meaning. Urbana: Univer. of Illinois Press, 1957.


Journal of Applied Psychology
Vol. 43, No. 5, 1959


STUDIES IN MANAGEMENT TRAINING EVALUATION:
II. THE EFFECTS OF EXPOSURES TO ROLE PLAYING 1


C. H. LAWSHE,2 ROBERT A. BOLDA,3 AND R. L. BRUNE 4


Occupational Research Center, Purdue University


Human relations training in management is characterized by variability
with respect to training content and training technique. To the extent
that the content is appropriate to the accomplishment of the training
objective, and the training technique is capable of facilitating
learning, the training presentation may be considered effective. The
empirical evaluation of both training content and training technique is
required. In particular, it is felt that research in the now-popular
participative training techniques is necessary; such investigations
would indicate the particular virtues of each approach and the specific
types of applications which result in optimal effectiveness.

A series of studies has been undertaken to evaluate role playing as a
tool in management human relations training. While there are several
variations of role playing, experimental interest has been centered on
the most popular industrial method, skit completion. In this approach,
trainees are presented with a case involving the development of a
problem situation and are required to enact a completion of the scene
spontaneously. For example, one trainee may be asked to play the part of
the foreman, one the employee, and in some instances, the remaining
trainees are required either to observe the action passively or to
identify with one of the role players. It is the purpose of this article
to summarize five studies which have been conducted to evaluate the
effects of single and repeated role playing experiences. Three groups
received one session, and two groups participated in four weekly role
playing sessions.


1 This research was carried out under a grant from the Foundation for
Research on Human Behavior, administered through the Purdue Research
Foundation.
2 Now with University Extension, Purdue University.
3 Now with Chevrolet Engineering Center, General Motors Corporation.
4 Now with Missiles [illegible], Lockheed Aircraft Corporation.


Evaluation criteria. A “work sample” hu- 
man relations training case was administered 
to trainees in pre- and posttraining situations 
in order to obtain indices of human relations 
performance levels. An incomplete-type case, 
the Case of the Reddened Eyes,5 was selected
for this purpose after considerable preliminary 
research (Lawshe, Bolda, & Brune, 1958). 
This case presents the development of a fore- 
man-employee problem situation, and con- 
cludes at the point where some form of su- 
pervisory action is required. After viewing 
the sound-slide film, trainees are asked to 
give open-end responses to the following ques- 
tions: 

1. If you were the foreman, what would 


you do now? 
2. Why did the employee behave the way 
she did? 


Responses to the first question were scaled
on a continuum of Employee-orientation, in- 
dicating the extent to which the response 
tended to cope with the human problem pre- 
sented in the case. Responses to the second 
question were scaled on a continuum of Sensi- 
tivity, reflecting the extent to which the re- 
sponse indicated a perceptiveness of the so- 
cial cues presented in the film. Scaling pro- 
cedures are described in an earlier article in 
this series (Lawshe et al., 1958). It is pos- 
sible to evaluate the effects of the training 
experiences by comparing pre- and posttrain- 
ing response scale scores. 


Procedure 


Study 1. A campus conference group of residential contractor foremen
participated in this study.6 At the outset of the period, the Case of
the Reddened Eyes was shown and responses to the two criterion questions
were obtained. Forty of the foremen were randomly divided into two
role-playing treatment groups. The first 20 were randomly assigned to
five-man role playing subgroups. Within each subgroup one S was assigned
to each of the following role positions: foreman role player, worker
role player, foreman identifier, worker identifier and observer. The
role players were instructed to enact a completion of the criterion
case. A graduate student leader in each subgroup got the sessions
started and led a post-role-playing discussion period. Role playing
sessions lasted from 10 to 20 minutes and discussion periods ranged
between 15 and 25 minutes. (The leaders were instructed to minimize
group decision with regard to the “real reasons for the employee’s
behavior.”) No attempt was made to present a single best action
alternative to the case.

The twenty foremen in the second treatment group were similarly divided
into five-man subgroups. These participants were assembled in a separate
room and shown the sound slide film Case of the Reluctant Electrician.
Role position assignments were given as described above. The men role
played and discussed this second case only. Discussion characteristics
and time allotments were identical with those mentioned above.


5 Description given in McGraw-Hill News Bulletin L-23129 TF, announcing
the filmstrip series: Supervisory Problem in the Plant.
6 The authors wish to acknowledge the cooperation of George E. Davis and
Merle McClure, Division of Adult Education, Purdue University.

er scal- 
ibed. The reliabilities 
signed by four expert 
the Employee-orienta- 


Study 2. Participants in the second study were taken from a group of
school custodians meeting in a campus conference session. Twenty-six Ss
were randomly selected for role [illegible] men were taken aside
[illegible] instructions for the [illegible] Stubborn Employee (Maier,
1952); the other 13 [illegible] worker’s role.

The Case of the Reddened Eyes was administered to the participants both
before the role playing and immediately after. Responses to the
Sensitivity and Employee-orientation questions were obtained after each
administration, and were identified according to the role position of
the respondent. The open-end responses were scaled by the abbreviated
master scaling procedure. The reliabilities of the average


C. H. Lawshe, R. A. Bolda, and R. L. Brune 


scores assigned by four expert judges were [illegible] and
.93 for the Sensitivity and Employee-orientation responses,
respectively.
Study 3. The participants were 27 staff supervisors in an educational
institution participating in a 13-week management training program. The
experimenters made two-hour presentations to this group during two
successive weeks. On the first week, the participants were randomly
divided into four- and five-man subgroups to role play the Case of the
Stubborn Employee. Role position assignments were made in the manner
described in Study 1. Only the worker role players were told about the
fear of heights. Role playing was allowed to continue until all
subgroups had completed the skits. At that time, the leader revealed the
height problem and encouraged group discussion of the [illegible]
attempted to emphasize the importance of “finding out why.”

During the second presentation a week later, a tape recording of a role
playing session on the same case was presented, and the leader attempted
to integrate this action with lecture material on social perception. The
consequences of incomplete information on supervisory actions were
emphasized. At the end of the second period, the Case of the Reddened
Eyes was shown and responses to the two criterion questions were
obtained. It was possible to identify respondents according to their
role position during the first week. No premeasure was taken because
time requirements did not permit.

The open-end responses were scaled by the abbreviated master scaling
procedure. The reliabilities of the average scores assigned by four
expert judges were .93 and .96 for the Sensitivity and
Employee-orientation scales, respectively.

Study 4. The participants were 16 male line and staff supervisors
participating in a company management development program. At the outset
of the first period, they were shown the Case of the Reddened Eyes, and
were asked to write their responses to the criterion questions.

The open-end responses were subsequently scaled on the
Employee-orientation and Sensitivity continua. The supervisors were
divided randomly into four-man groups and one pair of trainees in each
group was asked to assume the foreman and worker roles in a role playing
skit on the criterion case. Other group members acted as observers.
After the role playing demonstrations were completed, the experimenter
led a discussion of reasons for the employee’s behavior, and
similarities between the case and their own experiences. No attempt was
made to lead the trainees to group decision with respect to either (a)
the “real” reasons for the employee’s behavior or (b) the “best” way to
handle the case. It should be pointed out that the trainees had been
meeting as a group for several months prior to the [illegible] and had
participated in many discussion-type training experiences. For this
reason, it was not possible to steer the group away from general
decision with respect to the “Why” factor, although every attempt was
made to minimize this.


Studies in Management 


During the three subsequent weeks, the trainees 

were presented with other training cases for role 
playing and discussion. In every case, the post-role- 
playing discussion periods were handled in the man- 
ner outlined above. On the fifth week, the Case of 
the Reddened Eyes was readministered, and responses 
to the two stimulus questions were obtained. In ad- 
dition, the participants completed a role playing 
evaluation questionnaire (Lawshe, Brune, & Bolda, 
1958). After the final session, the open-end re- 
sponses from the first and fifth weeks were pooled 
and scaled by a forced-sort scaling procedure. 
Study 5. The Ss were 29 supervisors from a variety
of industries enrolled in an institutional five-week
Human Relations Training course. Fourteen persons
attended a morning session; 15 were enrolled in an
afternoon session. The same instructor presented the
same material in both two-hour sessions. The morning
participants became the experimental group and were
available to the experimenters for the first hour of
each week. The afternoon trainees were designated the
control group.

At the first session, both groups were shown the 
criterion film strip and wrote their responses to the 
Sensitivity and Employee-orientation questions. In 
addition, they were asked to indicate why they decided
upon a particular course of action by marking one item
in the following check list:

a. ( ) Convince the girls that the lights were O.K.
b. ( ) Patch up Joan’s hurt feelings.
c. ( ) Show the girls that you are still the boss.
d. ( ) Assure Joan that she is important.
e. ( ) Find out if the lights are harmful to the eyes.
f. ( ) Make Joan an example to the other girls.


The check list forms the basis of an “Intentions”
classification. The experimenters then presented the


Training Evaluation: II 289 


Case of the Stubborn Employee (Maier, 1952) to 
the experimental group for role playing. 

During the three subsequent training sessions, cases 
were presented for role playing which illustrated both 
the “Why” aspects of human relations, and also the 
“What to do” factors. Discussion periods after role 
playing sessions were directed so as to emphasize 
both the importance of finding out “Why” and the 
importance of selecting an appropriate course of su- 
pervisory action. 

The control group simply discussed the technical 
aspects of the Case of the Reddened Eyes for sev- 
eral minutes; the experimenters then left the room 
and did not reappear until the fifth week. 

At the fifth session, both groups were again shown 
the Case of the Reddened Eyes. Sensitivity, Em- 
ployee-orientation and Intentions responses were ob- 
tained. In addition, experimental group members 
completed a 17-item role playing questionnaire. 

Sensitivity and Employee-orientation responses were
scored by four judges using the abbreviated master
scaling method. The reliabilities of the average
scores were .96 and .95, respectively. Responses to
the Intentions check list were treated as classification
frequencies.
Results 


Single exposure studies. Analyses of variance were
applied to the Sensitivity and Employee-orientation
scale scores obtained in Study 1. None of the main
effects or interactions was found to be significant.
The pre- and postposition and treatment means are
shown in the top section of Table 1.

Simple t tests were applied to the differences
between pre- and postmeasures on fore-


Table 1

Pre- and Postresponse Mean Scores in Single Exposure Studies

                               Sensitivity Scale   Employee-orientation Scale
Study   Role Position    N     Pre      Post       Pre      Post
7. 3.0 52.9 45.6 
S 7 8 413 4 
MD Genee eee o 
46. i 
Foremen identifiers : ae es a oo 
yorker inea 8 37.3 35.4 50.7 35.6 
5 31.0 36.0 32.4 37.8 
ork 
66.3 56.5 
Study 3 Foremen 6 p 6 
Workers 5 tae py 
Foremen identifiers 3 = = 
Worker identifiers 3 T ei 
Observers 3 3 i 




C. H. Lawshe, R. A. Bolda, and R. L. Brune 


Table 2

Pre- and Postresponse Mean Scores in Repeated Exposure Studies

                               Sensitivity Scale   Employee-orientation Scale
Study   Role Position    N     Wk. 1    Wk. 5      Wk. 1    Wk. 5
tudy = i E 
3 46.6 
Q 4 52.0 63.1 60.8 par 
e eke 4 53.3 69.1 61.9 bs: 
Observer 8 41.5 57.6 54.9 ae 
<n 2 54.9 
Study 5 (All Control Members) 15 41.2 40.9 ao Ta 
y (All Exptl. Members) 12 48.2 53.9 47.8 ‘ea 
Foreman 6 40.8 48.2 51.8 pa 
Workers 6 57.0 57.8 45.7 ee. 


man and worker role players in Study 2. The
Employee-orientation increase for the foreman role
players (from 32.4 to 37.8) achieved significance
(t = 2.00). Foreman role players’ scores on Sensitivity
also increased (NS), while worker role players’ scores
were not significantly different on either measure.
Response means for Study 2 are shown in the center
section of Table 1.

One-way classification analyses of variance were
applied to the data obtained in Study 3, treating role
position as the variable of classification. Neither
Sensitivity nor Employee-orientation mean scores
differed significantly by role position on the basis of
the over-all F test. The response means are shown in
the lower section of Table 1. Since the experimenters
had decided a priori to test for differences between
foreman and worker role player means, simple t tests
were applied to these data on both measures. While
foreman role players tended to give better responses
on both scales, only the difference on the Sensitivity
measure was significant (t = 2.18).

Repeated exposure studies. The over-all pre- and
posttraining Sensitivity mean scores (first and fifth
week responses) in Study 4 were 47.0 and 63.0,
respectively. The difference between these means is
significant (t = 2.44). That the shift was
characteristic of all Ss, rather than being
differentiated according to the S’s role position on
the first week, is disclosed in Table 2.

The Employee-orientation means for all persons were
58.0 on the first week and 50.4 for the fifth. This
difference is not significant. A comparison between
means of persons who had assumed various of the role
positions on the first week did not reveal differential
effects on this variable. The response means are shown
in Table 2.

Unweighted means analyses of variance were applied to
the Sensitivity and Employee-orientation scores obtained
in Study 5 in the control and experimental groups. These
analyses revealed that the groups were homogeneous on
the first week with respect to scores on both criterion
variables.

However, an examination of the Groups X Weeks
interaction revealed that the experimental group
improved significantly on the Sensitivity dimension
while the control group did not. A similar analysis
applied to the Employee-orientation data failed to
disclose significant improvements for either group.
The response means in Table 2 show that members of the
experimental group improved somewhat on this variable
while control members maintained their initial score
level.

The experimental group members were either worker or
foreman role players on the Case of the Stubborn
Employee in week 1. The means of these subgroups are
shown in the lower part of Table 2. The foreman role
players significantly improved their Sensitivity
performance (p < .01). No other mean differences
achieved significance.

Regarding Intentions, the check list items (b) and (d)
represent “good” responses. The remaining items are
designated as “other.” The frequencies in Table 3 are
based on persons who initially indicated “other”
responses. The table shows the number of persons within
the groups who repeated “other”




Table 3

Intention Changes between Weeks 1 and 5 (Study 5)

                No. of Persons        No. of Persons
                Staying in “Other”    Shifting to “Good”
Group           Category              Category
Experimental    4                     4
Control         7                     0


responses or changed to “good” responses.
The Fisher exact probability test (Siegel,
1956) yielded a p = .048 (one-tailed). The
more favorable shift of Intention response
occurred among the experimental Ss.
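
The one-tailed probability can be checked directly from the cell frequencies of Table 3. The following is a minimal sketch in Python, assuming the frequencies as printed (4 staying and 4 shifting in the experimental group, 7 staying and 0 shifting in the control group). The function name `fisher_one_tailed` is illustrative and does not come from the original report. Because no control member shifted, the observed table is already the most extreme one possible with these margins, so the tail reduces to a single hypergeometric term.

```python
from math import comb

def fisher_one_tailed(a, b, c, d):
    """One-tailed Fisher exact probability for the 2 x 2 table
        [[a, b],
         [c, d]]
    where b and d are the counts that shifted in rows 1 and 2.
    Sums the hypergeometric probabilities of the observed table and of
    every table with the same margins in which row 1 has more shifts.
    """
    row1, row2 = a + b, c + d
    shifted = b + d                    # fixed column total
    n = row1 + row2
    denom = comb(n, shifted)
    upper = min(row1, shifted)         # largest possible count in cell b
    tail = sum(comb(row1, x) * comb(row2, shifted - x)
               for x in range(b, upper + 1))
    return tail / denom

# Frequencies as printed in Table 3 (experimental row first).
p = fisher_one_tailed(4, 4, 7, 0)
print(round(p, 3))
```

With the frequencies as printed, the expression evaluates to 70/1365, or about .051. This is close to, but not identical with, the reported p = .048; the sketch makes no attempt to account for the difference.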


Discussion and Conclusions 


The pattern of results obtained in these five studies
suggests the tenability of postulating impact as a
factor in effecting change in human relations
dimensions. Impact is here defined as a characteristic
of a training experience which (a) allows the trainee
to criticize his own performance in human relations
tasks, (b) provides an adequate type of feedback to the
trainee regarding his performances, and (c) serves to
emphasize a particular human relations factor in a
strong, emotional manner. The authors interpret these
results as indicating the absence of impact in Study 1.
In that study, participants were allowed to role play
and discuss the cases without direct indication of the
“real adequacy” of their performances and perceptions.
As a result, no response changes on the criterion case
were noted.

The impact factor, however, was present in Studies 2,
3, 4, and 5. The first two groups role played only the
Case of the Stubborn Employee. It has been the authors’
experience that only about 5 to 10% of the foreman role
players discover the height problem in this case.
Consequently, many foreman players resolve the problem
situation by discharging, transferring or otherwise
taking aggressive action against the “stubborn”
employee. The revelation that there is a real factor
behind the reluctance contributes to an impact
experience. The effects of this experience seem to have
generalized to improved performance on a second
training case, the Case of the Reddened Eyes. It should
be noted that impact, as evidenced by significant score
improvement, occurred in Study 2, Study 3, and Study 5
for only the foreman role players, in two instances on
the Sensitivity dimension and in the third on
Employee-orientation.

The effects of repeated exposures to role 
playing evidenced in both Study 4 and Study 
5 are reflected in increased Sensitivity scores. 
In Study 4 it was found that a standard train- 
ing program did not effect these changes. It 
was proposed that only the participant who 
is placed in a problem-solving situation and 
is made to realize the inadequacy of his initial 
response is likely to benefit from the experi- 
ence. The foreman role players in the Case 
of the Stubborn Employee (Study 5) had this 
opportunity. The fact that Sensitivity im- 
provements in Study 4 were not differentiated 
according to role positions on the first week 
indicates that the impact experience was a 
general one for all participants, and that the 
impact may very well have occurred as a re- 
sult of the type of discussion which followed 
role playing. 

Neither of the repeated exposures pre-post 
Employee-orientation comparisons resulted in 
rejection of the null hypothesis. Although 
the trend in Study 5 is in the direction pre- 
dicted, the “error” variability is sufficiently 
large to negate the mean score improvement. 
The authors suggest that the observed im- 
provement, however, is a reflection of (a) the 
types of cases presented, and (b) the types of
post-role-playing discussions held.

The lack of Employee-orientation change in 
Study 4 is not easily explained. Although the 
experimenters attempted to lead a “Why”- 
oriented discussion, the participants succeeded 
in bringing the “What to do” element into 
the picture. This discussion is not reflected 
in scale score shifts. Of all the possible ex- 
planations for this phenomenon, the authors 
prefer the following: The trainees were all 
members of one management organization. 
To the extent that organizational patterns of 
“accepted” behavior are present, training can 
be expected to have little effect on action re- 
sponses. If these managers have a common 
attitude toward employee problem situations, 
and a homogeneous method for classifying
these problems, abbreviated training would be
relatively ineffective in altering these factors.

In Study 5, there is some evidence that 
Employee-orientation scores and Intentions 
are differentially affected. 

The results cited here suggest the follow- 
ing conclusions with regard to exposures to 
role playing: 


1. Experiences with the skit-completion 
method of role playing are effective in pro- 
ducing Sensitivity and Employee-orientation 
changes to the extent that impact occurs in 
the training. 

2. The beneficial effects of such an experi- 
ence are capable of transferring to a second 
or novel human relations problem situation. 

3. Repeated exposures to role playing, as
administered in these studies, contribute little
to criterion response improvements.

In both repeated exposure studies, signifi- 
cant Sensitivity shifts occurred where impact 
was involved on the first week. There is 
some indication that case selection and type 
of discussion can increase Employee-orienta- 
tion. 

4. Impact experiences may occur either as 
a result of the type of case used, or as a re- 
sult of the type of discussion held after role 
playing. 

5. Improved Sensitivity responses are not
necessarily accompanied by better
Employee-orientation responses.


gest that the “What 
emphasized througho 
phenomena observed 
due to the relative q 
ployee-orientation. 




Summary 


Five studies were conducted to evaluate the effects of
single and repeated exposures to the skit-completion
method of role playing. Evaluation criteria consisted
of scaled responses to a standard human relations
training case in two dimensions: Sensitivity and
Employee-orientation. Criterion responses were obtained
before and after role playing in four subject groups,
and after the training in a fifth group. Various role
playing treatment conditions and role assignments were
investigated. It was found that improvements in
criterion case responses were effected in only those
instances where impact occurred in connection with the
training experience. Impact is effected by (a) case
materials and (b) the type of discussion held after
role playing. In addition, in those cases where the
impact factor was evident, the effects of this
experience were capable of generalizing to performances
on a second training case.

Within the comparative limitations imposed by the
present research procedures, repeated exposures to role
playing showed little advantage over the single, impact
experience. It was also found that Sensitivity and
Employee-orientation improvements are differentially
affected.


Received October 20, 1958. 


REFERENCES

Lawshe, C. H., Bolda, R. A., & Brune, R. L. Studies in
management training evaluation: I. Scaling responses to
human relations training cases. J. appl. Psychol.,
1958, 42, 396-398.

Lawshe, C. H., Brune, R. L., & Bolda, R. A. What
supervisors say about role playing. J. Amer. Soc.
Train. Directors, 1958, 12(8).

Maier, N. R. F. Principles of human relations. New
York: Wiley, 1952.

Siegel, S. Nonparametric statistics. New York:
McGraw-Hill, 1956.


Journal of Applied Psychology
Vol. 43, 1959


FOLLOW-UP ON THE VALIDITY OF A FORCED-CHOICE
STUDY ACTIVITY QUESTIONNAIRE IN ANOTHER 
SETTING 


HOWARD MAHER 


University of Pennsylvania 


Schutter and Maher (1956) have previously reported in
this Journal the validation and cross-validation of a
forced-choice study activity questionnaire. It was of
interest to the present author to see if the
instrument, validated and cross-validated on a state
college population, would carry over its validity to
another, private university. Also, in the construction
of the questionnaire, as described in the original
article, the instrument was designed to have no
correlation with scholastic aptitude test scores, thus
to add more greatly to the multiple prediction of
grade-point average. It is of importance to see if this
relationship will hold using another test of scholastic
aptitude and with another sample.

Moreover, the population on which the test was
originally standardized was composed of freshmen and
sophomores. The present sample is composed of all
classes save freshmen, and one may be interested in
seeing whether the original finding of no differences
in class scores holds for this sample. If so, separate
norms would not be required for use with the different
undergraduate classes.


Procedure 


The Ss used were 189 sophomore, junior, and senior
students of finance and commerce taking an introductory
psychology course. Each student was asked to fill out
the 30 block forced-choice scale (Schutter & Maher,
1956), using IBM answer sheets. These were machine
scored, using the weights of minus three to plus three
as established on the original Iowa State College (IOWA)
validation groups. For the same Ss there were also
available College Entrance Examination Board,
Scholastic Aptitude Test (SAT) scores and accumulated
grade-point averages (GPA). Since the ACE-L score was
used in the I. S. C. investigation, it was decided to
make for comparability of the scholastic aptitude
measures by using only the SAT verbal scores (V).

To answer the validity and intercorrelation questions
(above), the Pearson product-moment coefficients were
computed among scores on the study activity
questionnaire (SAQ), SAT-V, and GPA. Also, the
zero-order coefficients were combined to obtain the
multiple prediction of GPA. Finally, t tests were run
among the mean scores of the classes to answer the
third question above.


Results 


Parres (1955), in an investigation of the 
SAT at this university, reported all finance 
and commerce students to have mean SAT-V 
scores of 496.7 (SD = 84.19). For the pres- 
ent sample, the mean is 499.7 (SD = 80.06). 
The difference is not statistically significant
(t = .43), and the chosen sample would appear
comparable to the pertinent population.

Table 1 shows the validities of SAQ and 
SAT-V and the intercorrelation between SAQ 
and SAT-V. (All are Pearson product-mo- 
ment coefficients.) Both SAQ and SAT-V 
contribute significant prediction of GPA (an 
r of .21 is significant at the .01 level). 

When the two predictor zero-order correlations are
combined, R1.23 = .62, where:

1 = GPA
2 = SAQ
3 = SAT-V
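
For two predictors, the multiple correlation follows from the three zero-order coefficients by the standard formula R² = (r12² + r13² − 2·r12·r13·r23) / (1 − r23²). The following is a minimal sketch in Python, assuming r12 = .48 (GPA with SAQ), r13 = .40 (GPA with SAT-V), and r23 = .03 (SAQ with SAT-V, the intercorrelation quoted in the Discussion). The function name `multiple_r` is illustrative and not part of the original report.

```python
from math import sqrt

def multiple_r(r12, r13, r23):
    """Multiple correlation of variable 1 with variables 2 and 3 combined,
    computed from the three zero-order correlations."""
    r_squared = (r12 ** 2 + r13 ** 2 - 2 * r12 * r13 * r23) / (1 - r23 ** 2)
    return sqrt(r_squared)

# Assumed inputs: GPA-SAQ, GPA-SAT-V, and SAQ-SAT-V correlations.
R = multiple_r(0.48, 0.40, 0.03)
print(round(R, 2))  # 0.62
```

Because the two predictors are nearly uncorrelated, R² is close to the simple sum of the two squared validities (.2304 + .1600), which is why SAT-V adds about .14 to the SAQ validity of .48.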


The question as to whether SAQ scores differ
significantly among sophomores, juniors, and seniors
was handled by computing the t tests among scores for
these classes. None of the t’s in Table 2 reaches the
5% level of significance.


Table 1

Correlations among the Predictor Variables and the
Criterion of Grade-Point Average and Means and
Standard Deviations of the Three Variables

           SAT-V    GPA      M        SD
SAQ a      .03      .48      2.18     13.03
SAT-V               .40      499.7    80.06
GPA b                        3.28     .74

a The possible raw scores are from −41 to 38.
b The highest possible GPA is 5.0.


Table 2 


] f SAQ Scores by Class and Significance of 
S E of SAQ Class Score Means 


M N t 
-1 78 
So 3 50 
Te 2.2 76 
2.2 76 
r. 2.2 
J 1.07 
Sr. =1.0 35 
So. Si 78 
1.40 
Sr. —1.0 35 
Discussion 


The items of the SAQ were originally selected by
comparing the responses of groups matched for
scholastic aptitude (ACE-L score). The high criterion
group’s members (overachievers) were, however, at least
seven-tenths of a standard error of estimate above the
regression line to predict GPA from ACE-L. The low
group was selected from the area equally removed from
and below the regression line.


ake for a 
SAQ and 
two vali- 
he IOWA 
ation sam- 
CE-L) was 
at the time 
‘stacked the 


(Schutter & 
apparently is not so. Table 1 shows that for this
(different) sample and even for a different (although
likely comparable) scholastic aptitude test the
intercorrelation (r = .03) remains not significantly
different from r = .00. The planned-for relationship is
thus demonstrated to hold up.

While SAQ’s items were not selected to reflect no
differences among classes, the original finding of no
significant score differences between freshmen and
sophomores is seen to hold here in another institution
and for different classes, e.g., sophomores, juniors,
and seniors (Table 2). Although mean SAQ scores drop
off from the sophomore to the senior level, none of the
differences may be considered significant. It may be
that there are differences in study activities among
undergraduate classes. If so, the test does not reflect
them, and, until such differences are demonstrated, the
test may be considered suitable for use at all
undergraduate levels without the necessity of providing
separate norms.

The paramount consideration would, of course, be that
of validity. Since the test was keyed originally on
compound probabilities (Baker, 1952) from two
validation samples and cross validated on a third
sample, and since the original key was used here, no
need was seen for the use of two (validation and
cross-validation) samples in the present study. This
procedure represents a cross-institutional
cross-validation. Table 1 shows a comforting situation
indeed, i.e., the test is valid in this different
setting. The original cross-validation r was .36. The
present r is .48. At first glance this would appear to
be surprising, i.e., the repeat r apparently is higher
than the original cross-validation r. However, the
fiducial limits for the original r of .36 run upward
from .18, and the repeat r of .48 is seen to be within
these limits.

In the IOWA setting, the combination of SAQ and ACE-L
predicted GPA with R = .53. The better predictor in
that situation was ACE-L, but SAQ raised the prediction
over ACE-L alone. In the present situation, however,
the better predictor would appear to be SAQ (Table 1).
Perhaps this occurs because SAT is used to select
students for the university. The multiple R was found
to be .62, with SAT-V raising the multiple prediction
by .14. The combination of SAQ with the scholastic
aptitude test, while one or the other apparently may
contribute more heavily in the different situations,
results in equally high multiple prediction in the two
institutions.




Summary 


This study investigates the possibility that (unknown)
institutional differences might affect the validity of
a forced-choice study activity questionnaire (SAQ)
validated at a state college and then applied in a
private university. The findings are as follows:

1. The cross-institutional cross-validation is r = .48,
significantly higher than r = .00. The original
cross-validation was r = .36.

2. In one setting (state college), the more significant
predictor is the scholastic aptitude test; in the
other, SAQ, possibly in the latter as a function of
selection of students using the scholastic aptitude
test. The multiple prediction in the former situation
was R = .53; in the latter .62. Thus SAQ is
significantly valid and, together with the scholastic
aptitude test, gives significant multiple prediction of
grade-point average in both situations.


3. The study questionnaire, designed origi- 
nally to have no correlation with scholastic 
aptitude score, holds its sought-for lack of re- 
lationship in this instance also. 

4. The original finding of no difference be- 
tween freshman and sophomore SAQ scores is 
repeated here among sophomore, junior, and 
senior students. 


Received November 13, 1958. 


References 


Baker, P. C. Combining tests of significance in 
cross validation. Educ. psychol. Measmt., 1952, 
12, 300-306. 

Parres, J. G. Prediction of success in the under- 
graduate schools of the University of Pennsyl- 
vania. Unpublished doctoral dissertation, Univer. 
of Pennsylvania, 1955. 

Schutter, Genevieve, & Maher, H. Predicting grade-
point average with a forced-choice study activity
questionnaire. J. appl. Psychol., 1956, 40, 253-257.


Journal of Applied Psychology
Vol. 43, 1959


THE USE OF IBM MARK-SENSE CARDS AS MULTIPLE-
CHOICE PAPER-AND-PENCIL TEST ANSWER
FORMS 1


ERNEST MADRIL 2


Personnel Laboratory, Wright Air Development Center 


Recent rapid advances in the field of elec- 
tronic computer technology have made most 
obvious the desirability of employing punch 
cards to record examinees’ responses to test 
items. The conventional methods of key 
punching test response data bottleneck and 
restrict the volume of test information that 
can be fruitfully employed in psychological 
testing. This paper is intended as a report of 
the experiences of the Personnel Laboratory, 
Wright Air Development Center in the use, 
on a world-wide scale, of mark-sense cards as 
multiple-choice answer forms. Within recent 
years several studies have been made relating 
to the use of mark-sense cards in the admin- 
istration of tests. Deemer (1948) discussed 
the use of mark-sensing in large-scale testing
from the point of view of centralized administration
and control of such test administration. He was
concerned principally with the mark-sensing of test
scores achieved by the examinees. Appel and Cooper
(1955) discussed the use of mark-sense cards as test
answer sheets, but again from the point of view of
centralized control and administration of
psychological tests. A statistical report
prepared by the Educational Testing Service
(1956) indicated that the test performance 
of examinees using conventional answer sheets 
did not differ significantly from the test per- 
formance of examinees using a new type an- 
swer card similar in size to the familiar IBM 
punched card. 

The United States Air Force program of assessing its
airmen’s on-the-job proficiency began in 1952 and
continues as a decentralized test administration, and a
centralized

1 This report results from work done under ARDC 
Project 7717, Task 17131, sponsored by the Person- 
nel Laboratory, Directorate of Laboratories, Wright 
Air Development Center. a 

2 The invaluable assistance given by William S.
Harris in planning and programing is gratefully
acknowledged.


scoring, test analysis, and research program
(Gilhooley, 1956). Currently the Personnel Laboratory
employs about 400 different Proficiency Tests to
classify approximately 200,000 airmen by skill levels
yearly. Approximately one-third of this block of tests
is administered each month at an average of 800
different installations, during a three-month cycle of
testing. Four such cycles of testing are conducted
yearly on an Air Force-wide scale. Personnel
classification requirements, accounting, and public
relations dictate that test results be made available
to the interested activities with the least possible
delay. Initially, conventional IBM test scoring
machines and desk calculators were used in test
processing and evaluation. Gradually techniques
requiring the use of IBM accounting equipment were
developed and adopted. Presently, IBM mark-sense cards,
in the form shown in Fig. 1, are employed in the Airman
Proficiency Test Program. The mark-sense positions for
the possible item responses are in letter form, such as
those suggested by Morton, Hoyt, and Burke (1955).

The mechanics of the centralized test processing
portion of the program follow this sequence:


1. Receipt and logging of test materials.

2. Reproduction of mark-sensed cards and key punching
of selected information identifying the examinee.

3. Editing of reproduced cards and general ordering of
cards by test identification.

4. Test scoring and preparation of frequency
distributions and summary statistics.

5. Linear transmutation of test scores and preparation
of comparative summary statistics.

6. Preparation of rosters of test results by testing
installation.

7. Mailing of test results to originating activities
and summaries of test results to Headquarters, United
States Air Force and to the respective major air
commands such as the Strategic Air Command and Air
Defense Command.


Test item analyses, research, and evaluation,
together with test booklet distribution and the
dissemination of test administration instructions,
overlap the operations listed above.

Completing the cycle of operations required an average
of seven weeks during the early stages of the program.
Presently, through the medium of mark-sense cards,
conventional IBM type electrical accounting equipment,
and an IBM Type 650 electronic data processing system,
the average time is only 28 days. The use of an
electronic computer has reduced significantly the
manual labor usually involved in large-scale test
processing, and the operating personnel requirements
for test scoring and analysis have been slashed by
better than 50%.

An examination of Fig. 1 reveals that the examinee
identifying form, Card 1, is flexible and permits the
entry of varying types of personal data by the
examinee, which can either be key punched or
mark-sensed in columns reserved on Card 2. Card 2 also
provides for the mark-sense entry of certain
identifying information which is required for scoring
and statistical analysis. The reverse side of Card 2
and both sides of Card 3 provide four response
positions each for a total of 150 test items. The item
response cards also contain two rows each upon which
spaces have been reserved for centralized editing and
coding. These positions are employed when examinees
omit responses to test items. Item omissions are
arbitrarily coded as wrong responses. These “reserved”
mark-sense positions can be used as a fifth alternative
response position by a minor modification of the card
design.
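
The scoring rule described here, in which an omitted item is coded as a wrong response, can be sketched as follows. This is a minimal illustration in Python and not the IBM Type 650 program used by the Personnel Laboratory. The function name `score_card`, the letters A to D for the four response positions, and the use of `None` for an omitted item are assumptions made for the example.

```python
def score_card(responses, key):
    """Return the raw score: the number of items answered correctly.

    responses -- list of marked positions ('A' to 'D'), None for an omission
    key       -- list of keyed (correct) positions, one per item
    """
    if len(responses) != len(key):
        raise ValueError("responses and key must cover the same items")
    # An omission (None) never equals a keyed letter, so it counts as wrong.
    return sum(1 for marked, keyed in zip(responses, key) if marked == keyed)

# Hypothetical five-item example; an operational card set would carry 150 items.
key = ["A", "C", "B", "D", "A"]
responses = ["A", None, "B", "D", "C"]
print(score_card(responses, key))  # 3
```

Treating an omission as a value that can never match the key keeps the rule in one place, so a fifth response position could be added by extending the set of letters without changing the scoring logic.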
The cards now used by the Airman Proficiency Test
Branch of the Personnel Laboratory are intended for
unspeeded tests, but similar cards are adaptable for
speeded tests as well. Space for more than 150 test
items can be provided by adding the necessary cards to
the test booklet or by using additional booklets. All
cards of the booklet are prenumbered and prepunched
with the same 7-digit serial number and a 1-digit card
number. This facilitates collating operations during
mark-sense reproduction and later in test scoring and
analysis.

Punching of mark-sensed identifying in- 
formation and test answers is accomplished 
through an IBM Type 519 document origi- 
nating machine equipped with a half-time 
emitter, twenty-seven 12-4 and 5-9 column 
splits, and a 60-column multiple punch blank 
column detection device, and a punch offset 
stacker. The column splits and half-time 
emitter enable the offset reproduction of the 
first row of 25 item responses from the second 
row of each block of 50 items. This provides 
for a single answer punch in each of 50 card 
columns of the reproduced cards (Fig. 2). 
Each reproduced card also carries the control 
number, card number, test identity, and a 
5-digit code identifying the installations to 
which the results are to be remitted. Three 
test answer cards are produced from the two 
mark-sensed cards. Each card reflects the 
responses of the examinee to each of 50 test 
questions. As reproduction of the mark- 
sensed answer cards proceeds, selected identi- 
fying information from Card 1 is key punched. 
This step is required only because the Air 
Force needs to identify examinees by name as 
well as service number. Were alphabetic in- 
formation not required, or if it could be trans- 
ferred to punched card form by optical or 
similar electrically operated reading devices, 
key punching could be eliminated for the 
Airman Proficiency Test Program. 

Test scoring and the preparation of fre- 
quency distribution tables and summary sta- 
tistics, singly for the given administration 
period and cumulatively for previous admin- 
istrations of the same editions of the tests, 
are accomplished on the IBM Type 650 di- 
rectly from the mark-sensed reproduced cards 
at the rate of 1,800 raw scores an hour. The 
numerous scoring keys for the many tests are 
stored on the IBM Type 355 RAMAC. At- 
tained raw scores and test identification are 
stored, during the scoring operation, on mag- 
netic tape using IBM Type 727 tape drives. 
The IBM Type On-line 407 is employed during
scoring to record selected identifying information
and raw scores of examinees attaining unexpectedly
low or high scores, and for the listing of
frequency distributions which are subdivided into
as many as five subgroups of examinees by test.
The summary statistics are punched to provide for
variable listing as required.

Fig. 1. Mark-sense test answer booklet.

Fig. 2. Reproduced test answer card.

In test scoring, the examinee's answers are
positioned on the lower accumulator (a 10-digit
word at a time) so that when the "key" word is
added algebraically, the correct responses selected
by the examinee are converted to the digit 5. This
is followed with the treatment of each item
response individually, thus: A digit 5 is added
into the high order position of the lower
accumulator, which causes the value of that item
response, if correct, to equal 10 and carry into
the upper accumulator. The high order digit in the
lower accumulator becomes zero. By resetting the
upper accumulator, shifting the lower accumulator
left one position, and testing on the upper
accumulator for non-zero, the machine can determine
whether or not the item is correct. If the upper
accumulator is zero, then that item response can be
identified as correct. An accumulation by tally of
such right responses during iteration of the
scoring routine, for as many items and words as
necessary, will yield a total score. This procedure
permits differential item weighting and selected
item scoring through programing.
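
The digit-5 routine described above can be sketched in a few lines of Python. This is an illustrative simulation only, not the IBM 650 program: the key encoding (each key digit equal to 5 minus the correct answer, with answers punched as digits 1 to 5 and 0 for an omitted item) and all function names are assumptions introduced here so that a correct response sums to 5 without carries between digits.

```python
WORD_DIGITS = 10  # one 650 word holds 10 decimal digits


def make_key_word(correct_answers):
    """Assumed key encoding: each key digit is 5 minus the correct answer (1-5)."""
    return [5 - a for a in correct_answers]


def score_word(answers, key_word):
    """Count correct responses in one word, mimicking the accumulator steps."""
    # Adding the key turns every correct response into the digit 5.
    # Each digit sum is at most 9, so no carries cross item boundaries.
    lower = [a + k for a, k in zip(answers, key_word)]
    tally = 0
    for _ in range(len(lower)):
        # Add 5 into the high-order position; a correct item becomes 10,
        # which carries out and leaves 0 behind in that position.
        lower[0] = (lower[0] + 5) % 10
        # Reset the upper accumulator (discarding the carry), then shift the
        # lower accumulator left one position into the upper accumulator.
        upper = lower[0]
        lower = lower[1:] + [0]
        # A zero upper accumulator marks the item as correct.
        if upper == 0:
            tally += 1
    return tally


def score_test(answers, correct_answers):
    """Score a whole test, one 10-digit word at a time."""
    total = 0
    for start in range(0, len(correct_answers), WORD_DIGITS):
        word_answers = answers[start:start + WORD_DIGITS]
        word_key = make_key_word(correct_answers[start:start + WORD_DIGITS])
        total += score_word(word_answers, word_key)
    return total
```

Under these assumptions, replacing the unit tally with a per-item weight would give the differential item weighting mentioned in the text, and skipping selected positions would give selected item scoring.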

Following the test scoring operation, a table of
the statistical factors (required for linear
transmutation of scores) and the program are
loaded into general storage. The key punched
identification cards (taken from Card 1) are
read, and test identification and score records
are taken from tape and collated with the
identification card data. The raw scores are
transmuted, frequency counts by variable 
class intervals are prepared by test and other 
categories, a master tape record is written, 
and master cards are punched at the rate of 
6,000 standard scores an hour. As the proc- 
essing of each test is completed, comparative 
frequency counts by variable class interval 
and by major air command are prepared and 
listed through the On-line 407. 

The total operating time for test scoring, 
preparation of frequency distributions, statis- 
tical summarizing, production of test record 
cards and master tape records on the IBM 
650 tape RAMAC system for approximately 
16,000 examinees is in the range of 13 to 18 
machine hours. One operator needs to be 
present in the machine room, but only for 
insuring that cards are fed into the IBM Type 
533 card reader and that the tape drives are 
properly selected and set. This may be con- 
trasted with conventional test scoring, key 
punching, and preparation of frequency dis- 
tributions employing the usual electrical ac- 
counting machinery, desk calculators, etc., for 
statistical purposes. The time and manpower 
differences are obvious. 

The only errors which have been experienced in
test scoring in the four months during which the
IBM 650 has been used have resulted from examinee
failure to properly identify the test which was
administered. This error rate has been less than 5%.


Table 1

Test Performance of Examinees Using Conventional Answer Sheets Compared
with Examinees Using Mark-Sense Cards on 315 Tests by Mean Scores a

                                          Number
Answer Forms Compared                     of Tests   Proportion
Conventional answer sheet means higher      152         .48
Mark-sense card means higher                114         .36
Means equal                                  49         .16
Total                                       315        1.00

a Airman Proficiency Tests administered during March-August 1958 to
96,720 airmen, Air Force-wide.


not be scored 
machines, 

of conventional 
Procedures to mark- 


the mark-sense cards, 
Control, and 
me excepting 


; J ees for marking 
their respective test answer forms. There is no
apparent reason to believe that the pertinent
characteristics of the two populations may have
differed in ways other than at random. Accordingly,
a test of the null hypothesis that the change in
answer forms would result in inferior test
performance by the group having used the mark-sense
cards seems appropriate. Three hundred fifteen
different tests were administered to as many pairs
of examinee populations. The raw score means
attained by the paired populations each were
compared. One hundred sixty-three of the tests
based on the use of answer cards yielded means
equal to or lower than those based on conventional
answer forms. The standard error of the observed
proportions is .03. A divergence of .02 in favor of
equal or lower means might be expected to occur by
chance 27 times in 100. Therefore, it is concluded
that the use of mark-sense cards as multiple-choice
test answer forms does not adversely affect the
test performance of Air Force personnel
administered Airman Proficiency Tests. The
conclusion is further confirmed by the distribution
of mean scores resulting from the two
administrations of the tests. Only [...] of the 114
lower means of the mark-sense card populations were
below 8 score points of their respective answer
sheet population means.
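
The arithmetic behind the comparison of 163 out of 315 tests can be checked with a short sketch. The article does not state how the chance probability was computed; the normal approximation to the binomial with a null proportion of .50 is an assumption made here, and the function name is hypothetical.

```python
import math


def proportion_test(successes, n, p0=0.5):
    """Return (observed proportion, standard error under p0, one-tailed p)."""
    p_hat = successes / n
    # Standard error of a proportion when the null value p0 is true.
    se = math.sqrt(p0 * (1 - p0) / n)
    z = (p_hat - p0) / se
    # Upper-tail area of the standard normal distribution.
    p_one_tailed = 0.5 * (1 - math.erf(z / math.sqrt(2)))
    return p_hat, se, p_one_tailed


# 163 of the 315 tests had card means equal to or lower than sheet means.
p_hat, se, p = proportion_test(163, 315)
print(round(p_hat, 2), round(se, 3), round(p, 2))
```

Under this assumption the divergence from .50 is roughly .02 and the one-tailed chance probability is close to the 27 in 100 reported in the text.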

In summary, the transition from the conventional
multiple-choice answer sheets and test scoring to
mark-sensing and scoring through the aid of
electronic computers is feasible and less
expensive, when processing large volumes of test
data, than conventional methods. Outside of
reduction in costs, it is of greater importance to
consider the increase in data that readily become
available for analysis and evaluation. This
augmentation in available data should result in the
development of information essential to the
improvement of paper-and-pencil tests constructed
for the Air Force.

As optical reading technology develops, the need
for key punching alphabetic information will be
further reduced. It is conceivable that if all
examinees were to be provided with plastic
identifying plates embossed with the necessary
fixed alphabetic and numeric information such as
name, date of birth, social security number, the
plates could be used in recording such information
in suitable form for the preparation of punched
cards by optical readers. The use of optical
readers will further enhance the employment of
mark-sense cards in the future, in that the readers
can be made more sensitive and can have greater
discriminating power than the usual devices which
rely on conductive rather than photoelectric
sensing.


Received November 20, 1958. 


References 


Appel, V., & Cooper, G. A refinement in the use of
mark-sense cards for test research. J. Amer.
Statist. Ass., 1955, 50, 557-560.

Deemer, W. L., Jr. The use of mark sensing in a
large scale testing program. J. Amer. Statist. Ass.,
1948, 43, 40-52.

Educational Testing Service. A study of a new type 
answer sheet. Statist. Rep., 1956, No. 56-23. 

Gilhooley, F. M. Proficiency test development and 
research for the Airman Career Program of the 
United States Air Force. Amer. Psychologist, 1956, 
11, 547-553. 

Morton, Mary A., Hoyt, W. T., & Burke, L. K. 
A new type of test answer sheet. Amer. Psycholo- 
gist, 1955, 10, 572. 


Journal of Applied Psychology
Vol. 43, No. 5, 1959


RELATIONSHIPS BETWEEN PERSONAL AND SOCIAL
DESIRABILITY SETS AND PERFORMANCE ON
THE EDWARDS PERSONAL PREFERENCE
SCHEDULE


ALFRED B. HEILBRUN, JR. AND LEONARD D. GOODSTEIN


State University of Iowa


The relationships between desirability fac- 
tors and probability of endorsement of self- 
descriptive statements on personality ques- 
tionnaires have aroused considerable research 
interest. Investigations of the Minnesota 
Multiphasic Personality Inventory (Hanley, 
1956; Rosen, 1956) and the Edwards Per- 
sonal Preference Schedule (Edwards, 1953) 
have demonstrated a high positive correlation
between the judged social desirability (i.e.,
the perceived desirability of a given trait
in others) of a test statement and its selection
as self-characteristic. Edwards (1954) matched
pairs of statements for social desirability in
developing his test to minimize the importance of
social desirability as a response determinant.


n 


nse set may not ac- 
ariance attribut- 
(1958) found a 
en need scores 
Schedule (PPS) 
rability of these 


res on the PPS 


are uncorrelated with judged social desirability
of the need, it was hypothesized that personal and
social desirability sets might produce different
test-taking behaviors. Goodstein and Heilbrun
(1959), in evaluating the relationship between
social and personal desirability values of the
individual PPS statements, found a very high
correlation (r = .90) between the two sets of
values. While the extent of the correlation limited
any independent variation, a significant difference
between social and personal statement values was
demonstrable when only statements associated with
judged-low personally desirable needs were
considered. No difference between personal and
social desirability values was found for the high
personally desirable need statements. These authors
suggested that Edwards' procedure of matching items
by social desirability values may be less effective
with statements measuring personally undesirable
needs to the extent that Ss assume a personal
desirability test-taking set.
Edwards (1953) has reported a high positive
correlation (r = .87) between social desirability
values and probability of statement endorsement.
In an unpublished study, Heilbrun and Goodstein
found a similar relationship (r = .90) between
personal desirability values and probability of
endorsement. Each of the two statements comprising
an item on the PPS has both a personal and a social
desirability scale value. Because the statement
pairs on a number of items are not perfectly
matched for their desirability values, combinations
of high and low values are possible. Indeed, it is
possible to find items on the PPS where the higher
personal and social desirability values are
associated with different statements in the item.
If social [...] operation of desirability sets may
be evaluated. The purpose of the present study was
to investigate the hypothesis that a personal
desirability set affects performance on the Edwards
PPS independent of a social desirability set.


Table 1

Data Concerning Statement Endorsement in Three Groups of Items (N = 20)
Selected from the Edwards Personal Preference Schedule

          Mean Difference in    Mean Difference in      Mean Number of
          Paired Statement      Paired Statement        Higher Socially
          Social Desirability   Personal Desirability   Valued Statements
          Values                Values                  Endorsed             SD
Group 1   0.28                  [illegible]              9.50                3.04
Group 2   0.28                  [illegible]             10.81                2.44
Group 3   1.03                  [illegible]             13.16                2.82


Method

Subjects

A sample of 248 undergraduate students (166 males
and 82 females) enrolled in an introductory
psychology class at the State University of Iowa
were administered the Edwards PPS under standard
conditions.


Procedure

Social desirability scores, derived from pooled
judgments and converted into arithmetic values by a
successive interval scaling method described by
Edwards (1957), were available for each statement
in the PPS.2 Personal desirability values, derived
in the same fashion (Goodstein & Heilbrun, 1959),
were also available.

2 The authors wish to thank Allan Edwards for
making the unpublished statement values available.

Three groups, each including 20 items (i.e., pairs
of statements), were then selected from the PPS
according to certain criteria. Group 1 consisted of
those items showing the maximum difference in the
social and personal desirability values of the
statement pairs and in which the higher social
desirability value was associated with one
statement in a pair and the higher personal
desirability value with the opposing statement.

Group 2 consisted of 20 items matched precisely
with Group 1 items for the magnitude of difference
in the social desirability values of the two
statements. Group 2 items differed from those in
Group 1 in that the higher social value and the
higher personal value were both associated with the
same statement in an item pair. In Group 2 both
types of desirability values would lead to the
prediction of the same and more highly desirable
response in a pair, whereas in Group 1 these values
would predict opposing responses in the pair. Since
differences in social desirability values for
statement pairs in Groups 1 and 2 are identical,
less endorsement of higher socially valued
statements in Group 1 could then be attributed to
the personal value differences which predict
selection of opposing statements.

Group 3 consisted of 20 items which showed the
maximum difference between the social and personal
desirability values of the statement pairs and
where the higher social and personal values were
associated with the same statement in a pair. The
expectation was that there would be a stronger
tendency to select the response having the higher
social desirability values than would be found in
either Group 1 or 2. The analysis of Group 3
response selection is not crucial to the hypothesis
under investigation but is included since it
provides an estimate of response selectivity under
conditions where desirability factors should be
most influential.

The systematic influence of a response set based 
upon position or order was avoided, since half the 
responses predicted by the higher social desirability 
value were first statements of the pairs and half 
were the second statements. 


Results 


The basic assumption underlying the ex- 
perimental predictions in the present study is 
that Ss will tend to select the item statement 
showing the higher social desirability value. 
This was confirmed by a preliminary analysis 
of statement endorsement. Out of the 204 
PPS items (eliminating six items with identi- 
cal social desirability scores for both state- 
ments), Ss selected an average of 114.07 
higher socially valued statements. This dif- 
fers significantly from a chance expectancy 
of 102 (t= 14.54, p < .001). 
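
The comparison with chance can be illustrated with a one-sample t ratio. The article reports the mean (114.07), the chance expectancy (102), the sample size (248), and t = 14.54, but not the standard deviation, so the sketch below only backs out the standard deviation implied by those figures. The function names are hypothetical, and the assumption that a simple one-sample t was used is ours.

```python
import math


def one_sample_t(mean, sd, n, mu0):
    """t ratio for a sample mean against a fixed expected value mu0."""
    return (mean - mu0) / (sd / math.sqrt(n))


def implied_sd(mean, t, n, mu0):
    """Standard deviation implied by a reported one-sample t ratio."""
    return (mean - mu0) * math.sqrt(n) / t


# Figures taken from the text; the SD itself is not reported there.
sd = implied_sd(mean=114.07, t=14.54, n=248, mu0=102)
print(round(sd, 2))
print(round(one_sample_t(114.07, sd, 248, 102), 2))
```

The chance value of 102 follows from the 204 scorable items, each with an even chance of the higher socially valued statement being chosen.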

The mean number of statement endorse- 
ments associated with higher social desirabil- 
ity values in Groups 1, 2, and 3 is presented 
in Table 1. Means for Groups 1 and 2 differ
significantly from each other (t = 5.33, p < .001)
in the direction of fewer endorsements of higher
socially desirable statements in pairs when
personal desirability value differences would
predict selection of the opposing response.

It can be seen in Table 1 that the average
endorsement of higher socially valued statements in
Group 3, in which paired statement differences are
large but in the same direction for both
desirability values, exceeds that for both Groups 1
and 2. The difference in mean endorsement between
Groups 1 and 3 is highly reliable (t = 13.79,
p < .001), as is the difference between Groups 2
and 3 (t = 9.89, p < .001).


Discussion 


The finding that Ss showed less of a tendency to
select more highly socially desirable statements as
self-characteristic when personal desirability
values would predict endorsement of the opposing
statement of the pair than when personal values
would predict the same response lends support to
the notion of some independent contribution of a
personal desirability set to statement endorsement
on the PPS beyond that attributable to a social
desirability set.

It is difficult to assess the practical signifi- 
cance of the finding that personal desirability 
values are related to Performance on the PPS 
but were not considered in the original state- 
Since the differ- 
nt frequencies in 


The matching Procedures 
tioned in that th 


ently allow desirability 
in statement endorsement, 
Ss in this study averaged endorsing more than 13
higher socially valued statements in a set of 20
such items (Group 3). This finding is in agreement
with that of Borislow (1958, p. 27), who concluded
that the PPS can be faked under personal and social
desirability instructions, but he modified this by
adding that the Schedule "is not greatly
susceptible to the influence of fakability in terms
of choice of socially desirable items, per se."
Although the number of items included in Group 3 in
the present study is not great considering the
total of 210 items from which scale scores are
derived, a closer inspection of these 20 items
suggests that the failure to match closely could
have a marked effect on certain individual scales.
It was found that the 20 statements having the
higher desirability values were fairly well spread
with respect to the need scale which each measures
in the test (Endurance, 4; Intraception, 3;
Succorance, Dominance, Order, Affiliation,
Heterosexuality, 2 each; Exhibition, [illegible],
Nurturance, 1 each). Thus the failure to match
closely should tend to inflate scores on at least
10 scales, with the Endurance and Intraception
scales being affected most. However, when the
statements having the lower desirability scores in
the 20 pairs were examined, the results were much
more striking (Aggression, 9; Exhibition,
Dominance, and Abasement, 3 each; Autonomy, 2). The
finding that in at least 9 of the 28 items (32%) in
which statements reflecting need Aggression appear
these statements are mismatched and lower in
desirability value certainly suggests caution in
the utilization of the Aggression score. Even more
caution is suggested by the finding that in 14 of
the remaining items involving Aggression
statements, these statements have lower social and
personal desirability values, in three items the
Aggression statements have lower social or personal
desirability values, and in only two items the
Aggression statements have higher personal and
social desirability values. Although further
specific checks for systematic matching flaws are
beyond the scope of this investigation, it is
worthwhile noting that the introduction of error
into any scale on the PPS by improper matching
procedures is bound to introduce error into the
remaining scales because of the forced choice
nature of the test (i.e., the failure to select one
statement in a pair as self-characteristic
automatically assigns a score to another scale).


Summary

This study was concerned with the hypothesis that
a personal desirability set operates somewhat
independently of a social desirability set in
determining response selection on the Edwards
Personal Preference Schedule. To test this
hypothesis 248 college Ss were administered the
PPS.

Three groups of 20 items each were selected from
the PPS: one group included items having a maximum
difference in the social and personal desirability
values of the paired statements in each item and
for which the two types of desirability values
predicted selection of opposing statements (with
the higher socially valued statement predicted for
endorsement); the items in the second group were
precisely matched with those in the first group for
between-statement differences in social
desirability values, but personal values predicted
endorsement of the same statements; the third group
of items were those in which there was a maximum
difference in the personal and social desirability
values of the paired statements but in which both
values predicted the selection of the same
response.

The hypothesis of some independent effects of
personal and social desirability sets upon response
endorsement was supported by the finding that
significantly fewer of the higher socially valued
statements were endorsed when personal values
predicted endorsement of the opposing statement
than when the personal values predicted selection
of the same response.

When the differences between personal and social
desirability values of paired statements were
maximal and both values predicted the endorsement
of the same response, Ss averaged selecting the
higher valued statement over 13 out of 20 times.
Inspection of these 20 items indicated that
specific scales, primarily the need Aggression
Scale, may be especially vulnerable to a
desirability set because of mismatching and should
be interpreted with caution.


Received November 21, 1958. 


References 


Borislow, B. The Edwards Personal Preference
Schedule and fakability. J. appl. Psychol., 1958,
42, 22-27.

Edwards, A. L. The relationship between judged 
desirability of a trait and the probability that the 
trait will be endorsed. J. appl. Psychol., 1953, 37, 
90-93. 

Edwards, A. L. Personal Preference Schedule. New 
York: Psychological Corp., 1954. 

Edwards, A. L. Techniques of attitude scale con- 
struction. New York: Appleton-Century-Crofts, 
1957. 

Goodstein, L. D., & Heilbrun, A. B. The relation- 
ship between personal and social desirability values 
of the Edwards Personal Preference Schedule. J. 
consult. Psychol., 1959, 23, 183. 

Hanley, C. Social desirability and responses to items 
from three MMPI scales: D, Sc, and K. J. appl. 
Psychol., 1956, 40, 324-328. 

Heilbrun, A. B. Relationships between the Adjective
Check List, Personal Preference Schedule, and
desirability factors under varying defensiveness
conditions. J. clin. Psychol., 1958, 14, 283-287.

Navran, L., & Stauffacher, J. C. Social desirability
as a factor in Edwards Personal Preference Schedule
performance. J. consult. Psychol., 1954, 18, 442.

Rosen, E. Self-appraisal, personal desirability, and
perceived social desirability of personality traits.
J. abnorm. soc. Psychol., 1956, 52, 151-158.


Journal of Applied Psychology 
Vol. 43, No. 5, 1959 


SUBORDINATES’ PERCEPTIONS OF THE PRODUCTIVE 
ENGINEER 


ROBERT E. STOLTZ 


Southern Methodist University 


Of major importance in our attempts to un- 
derstand and predict the behavior of persons 
is our knowledge of how the individual per- 
ceives not only himself, but those others who 
occupy space in his particular psychological 
field. This report summarizes an exploratory 
study dealing with how young engineers in a 
unique student-subordinate role perceive the 
engineers they chose to term “Productive.” 

There are at least three reasons why a study of
this type is useful and needed. First, it should
provide suggestions for hypotheses about the
productive process in this field that may be
further studied in more highly controlled and more
specific inquiries. Second, it is planned that a
future study will attempt to compare these data
with similar data provided by a sample of
engineering research supervisors. This would enable
one to determine what discrepancies, if any, exist
between these two different status levels. Third,
the results are useful in their own right in giving
us a better idea of what perfor[...]

an elementary Psychology Course, forty students in 


se, I and numb 
months engaged in industrial work Ps 


ing program were not Significantly 
different between the two sections, he rind 
number of months of industrial experience for the 
combined groups was 19.8 months, 


Instrument 


The instrument used in this investigation was the
Productive Behavior Checklist developed by Stoltz
and described in detail elsewhere (Stoltz, 1959).
Briefly, this checklist consists of 250 statements
taken from interviews with physical science and
engineering research supervisors. Pages of the
checklists were assembled in random order to
minimize any consistent tendency for certain items
to be rated differentially due to their serial
position within the checklist.


Procedure 


One section of the Ss was given copies of the
checklist and asked to describe the behavior of the
most productive engineer they knew by rating each
item on a five-point scale according to how typical
the behavior described by the item was of the
particular person they were rating. A rating of 5
for an item would indicate a very typical behavior.
The second section of the Ss was also given copies
of the checklist, but was asked to describe the
least productive engineer they knew. All of the Ss
were cautioned to describe only one particular
individual and not to describe productive or
nonproductive engineers in general. They were also
encouraged to be as fair as possible and indicate
the person's weaknesses as well as his strong
points.

The analysis of the individual items was done by
computing the t ratio between the means for each
item of the productive and nonproductive Ss'
ratings. Since the limitations of space preclude
listing the statistics for each of the 250 items in
the checklist, only the 30 most discriminating and
the 30 least discriminating items will be given in
Tables 1 and 2, respectively.1
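
The item analysis described above can be sketched as an independent-groups t ratio with a pooled variance, which is consistent with the 78 degrees of freedom given in the table notes for two sections of 40 Ss. The ratings below are invented for illustration only and are not data from the study; the function name is hypothetical.

```python
import math
from statistics import mean, variance

CRITICAL_T_01 = 2.64  # value given in the table notes for 78 df, .01 level


def pooled_t(ratings_a, ratings_b):
    """Return (t ratio, degrees of freedom) for two independent groups."""
    n_a, n_b = len(ratings_a), len(ratings_b)
    df = n_a + n_b - 2
    # Pool the two sample variances, weighting each by its degrees of freedom.
    pooled_var = ((n_a - 1) * variance(ratings_a)
                  + (n_b - 1) * variance(ratings_b)) / df
    se = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean(ratings_a) - mean(ratings_b)) / se, df


# Hypothetical five-point ratings of one item by the two sections of 40 Ss.
productive = [5, 4, 5, 4] * 10
nonproductive = [3, 2, 3, 4] * 10

t, df = pooled_t(productive, nonproductive)
print(df, round(t, 2), t > CRITICAL_T_01)
```

A positive t here means the item was rated as more typical of the productive engineer, matching the sign convention in the table notes.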


Results

One hundred and fifty-one of the 250 possible t
ratios were significant at or beyond the .01 level
of confidence. This high level of confidence was
selected since it was reasonable to assume that
some amount of halo effect might tend to inflate
the differences between the items. This effect
might be expected in view of the values attached to
the terms "productive" and "nonproductive" and due
to the favorable wording of almost all the items.
While it is quite possible that some halo effect
accounts for a portion of the remaining significant
differences, it seems equally likely, in view of
the design, that there are real differences in how
the productive person is perceived relative to the
nonproductive by the Ss.

1 A complete list of all the items, giving their
means, standard errors, and t ratios, may be
obtained by writing the author.


Table 1

Thirty Most Discriminating Items

Item                                                      t
Is a clear thinker                                     7.76
Is very efficient                                      7.22
Can handle anything given him                          6.90
Keeps his mind on his work                             6.88
Does more than his share of the work                   6.86
Is not easily distracted from his job                  6.71
Has real interest in his job                           6.62
Has good attitude toward his work                      6.60
Has ingenuity                                          6.48
Has good leadership qualities                          6.38
Comes up with new ways of doing things                 6.26
Does not go off on tangents                            6.26
Has more than casual interest in his work              6.25
Can grow into a job                                    6.11
Does not seem lazy                                     5.96
Develops respect in other people                       5.89
Takes responsibility well                              5.78
Is a self-starter                                      5.64
Is technically competent                               5.61
Has a store of information to apply to problems        5.49
Can pick out important details in a problem            5.46
Is orderly in his work                                 5.45
Is conscientious                                       [illegible]
Can organize the work of others                        5.39
Thinks of better ways of doing things                  [illegible]
Good, sound technical background                       5.29
Technical background is above average                  5.28
Can evaluate alternative approaches to problem         5.20
Offers his share of creative ideas                     [illegible]
Directs work of others effectively                     [illegible]

Note. Positive t ratios indicate that the item was rated as more
typical of productive than of nonproductive engineers. With 78 df,
a t must exceed 2.64 to be significant at the .01 level.


Discussion 


An analysis of the content of the discriminating
items indicated that a summary of the stereotype
existing for the Ss might best be presented by
considering their comments to fall into four major
content areas. This classification was arbitrarily
made, and the categories are not assumed to be
mutually independent or exclusive.


Intellectual Activity 


The productive engineer is seen as a versa- 
tile person, intelligent, with good analytical 
reasoning ability. This versatility is appar- 
ently restricted to activity within the engi- 
neering area, and it is not clear whether the 
subordinates consider it to extend into other 
activities. The subordinates indicate that this 
intellectual activity is controlled by a sense 
of the practical problems involved in a task, 
and, while the productive engineer is willing 
to try unique approaches or attempt new 
methods in search of a solution to a prob-
lem, the practicality of the solution is an im-
portant determinant in his selection of an
approach. Throughout their comments the
subordinates appear to make a distinction
between the immediate task area, i.e., engi-
neering problems, and activities that are pri-
marily outside of the task area. This distinc-
tion is marked in their comments regarding
the producer’s creativity. The productive
engineer is felt to be highly creative within
his area of competence, particularly as re-
gards his novel use of past experience, but he
is apparently not seen as being creative in a
more general sense. These views of the sub-
ordinates agree in many respects with the
findings of Harrison, Hunt, and Jackson
(1955a, 1955b, 1955c) regarding the per-
formance of mechanical engineers on a num-
ber of common psychological tests.


Table 2


Thirty Least Discriminating Items


Item                                                   t
Does not hide behind his degree                      0.00
Likes administrative work                            0.00
Does not fly off the handle                         -0.15
Active in social groups in the company               0.18
Does not jump into an explanation                    0.20
Is emotionally stable                                0.20
Is not gruff at times                                0.23
Is not cold and aloof                                0.36
Active in company sponsored activities               0.36
May be curt sometimes                                0.38
Is at ease with others                               0.40
Can’t stand to be unsuccessful                       0.48
Is not bound by limits                               0.49
Does not worry about trivialities                    0.62
Difficult to get excited                             0.64
Willing to put in extra time                         0.71
Interested in some things outside of his field       0.83
Does not offend others                               0.91
Is not irritable                                     0.92
Does not hurt others’ feelings                       0.92
Wants recognition                                    0.92
Has a need to be recognized                         -0.37
Is intelligent but not the brightest                -0.56
Has a lot of outside interests                      -0.62
Sometimes impatient with others                     -0.73
May try to do everything himself                    -0.78
Is not impatient with others                         1.02
Considers the problem of money                       1.08
Does not hurt feelings of others unnecessarily
Has imagination


Note. Positive t ratios indicate that the item was rated as
more typical of productive than of nonproductive engineers.
With 78 df, a t must exceed 2.64 to be significant at the .01 level.
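The note to Table 2 gives 2.64 as the value a t ratio must exceed, with 78 df, to reach the .01 level. The sketch below is one way to check that figure; it is not the author's procedure. It integrates the Student t density numerically with Simpson's rule, and the function names, the integration limit, and the step count are illustrative assumptions.

```python
import math


def t_pdf(t, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_coef = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_coef - (df + 1) / 2 * math.log1p(t * t / df))


def two_tailed_p(t, df, upper=60.0, steps=20000):
    """Two-tailed probability of |t| or larger, by Simpson's rule.

    The tail beyond `upper` is ignored (negligible for df = 78);
    `steps` must be even.
    """
    a = abs(t)
    h = (upper - a) / steps
    total = t_pdf(a, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(a + i * h, df)
    return 2 * total * h / 3


# Probability attached to the cutoff quoted in the table note.
p_cutoff = two_tailed_p(2.64, 78)
# Largest legible t ratio in Table 2, for comparison.
p_largest = two_tailed_p(1.08, 78)
print(round(p_cutoff, 4), round(p_largest, 4))
```

On this reading the cutoff corresponds to a two-tailed test, and every legible ratio in Table 2 falls well short of it, as the table title implies.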


Motivation 


One of the most marked characteristics of 
the productive engineer is his tremendous in- 
terest in his work. More specifically, his in- 
terest in problems within the engineering field 
is seen as quite strong, but he is not perceived
as more strongly interested in the company
than the nonproducer. His interest in his
work group is apparently seen as somewhat
intermediate in intensity. Earlier research
on the perceptions of physical science re- 
search supervisors indicated that the more 
highly motivated scientists might be more 
prone to aggressive attacks on fellow workers 
than the less motivated scientists. [...] hy-
pothesis receives little [...]
subordinates, [...]
hypothesis, but again
does not appear to be confirmed. Perhaps
the answer to this apparent contradiction of [...]


Ss used in this study. It might be hypothe- 
sized that the subordinates do not interpret 
aggressive responses directed toward them- 
selves by the productive persons as aggressive 
or as unwarranted, although similar remarks 


Robert E. Stoltz 


directed by nonproducers might be so inter-
preted. The halo effect which was [...]
earlier might have operated to [...] the
importance of aggressive acts by the highly
regarded productive engineers. There was no
provision in the present study for testing
either of these hypotheses.

The producer is seen as making a number
of distinctions concerning when and where
this motivation can be expressed. The sub-
ordinates see him as being quite willing to
take work home, but no more willing than the
nonproducer to put in extra time on the job
(this presumably means within the physical
confines of the plant). Other comments of
the subordinates follow this pattern and indi-
cate that the producers make a differentiation
between time, or perhaps effort, spent on the
job and time spent at the job. The subordi-
nates see the productive engineer as reacting
negatively to the physical confines and re-
quirements of the plant, but willing and able
to produce in a more self-regulated situation.
The findings of Van Zelst and Kerr (1951,
1952, 1954) regarding the belief in voluntary
determination of deadlines as a character-
istic of productive scientists support this per-
ception of the subordinates.


Personality and Social Factors 


The producer is seen as having a high de-
gree of independence needs and [...],
again chiefly within the job area, and with
a definite orientation toward accepting re-
sponsibility. This should not be interpreted
as describing completely what might be
termed “lone wolf” behavior. The producer
is not seen as a person who attempts to do
everything himself, but as one who will seek
out assistance and help from others. Again
it appears that task orientation might be the
major determinant of whether or not he will
sacrifice independence for the support which
he might need. Although the subordinates
see the producer as accepting help, they do
not seem to feel that he is any more willing
than the nonproducer to seek help from
experts. The subordinates indicate, however,
that the producer is probably more [...]
his shortcomings and deficiencies than the
nonproducer. A suggested hypothesis might
be that the less task involved the person is,


Subordinates’ Perceptions of the Productive Engineer


the less he will seek support from others. 
Harrison et al. (1955a, 1955b, 1955c) have 
indicated that there is a tendency for me- 
chanical engineers to be somewhat authori- 
tarian in their own activities and for them 
to accept authoritarian solutions somewhat 
readily. Perhaps the better answer to the be-
havior of the producers in this area will be
some combination of hypotheses regarding
acceptance of authority and ego involvement
in the task.

The subordinates do not appear to view
the producer as any more likeable or agree-
able than the nonproducer, but they do
consider him to be somewhat more mature,
although perhaps more aloof, than the non-
producer. The nonproducers are seen as be-
ing more “thick-skinned” than the producers
and as being more likely to upset others and
to be quick toward others. The content of
these items suggested that what the subordi-
nates might be describing were reactions of
the nonproducers which interfered with the
nonproducers’ task performance or which
were expressions of the nonproducers’ own
frustration. In order to clarify this problem,
a number of the Ss were questioned regard-
ing their responses to these items. The item
regarding “thick-skinned” behavior was inter-
preted by some of the Ss as referring to the
unwillingness or inability of the nonproducers
to perceive or accept attempts made to cor-
rect or improve their performance. More-
over, the responses of the nonproducers to
these attempts, presumably in the light of
their own perceptions of themselves as non-
producers in an environment where produc-
tion is highly regarded, tended to be “upset-
ting” to the subordinates and others in the
work groups.

The producer is not seen as a particularly
social person, and the data suggest that he
might even be less likely than the nonpro-
ducer to have an active social life outside the
company group.


Administrative Activity


The subordinates see the producer as hav-
ing the ability to capably administer his own
work and the work of others, but as not be-
ing particularly fond of administrative work.
This might well be due to the belief, or fact,

as some of us feel, that administrative work 
is typically routine work. The subordinates 
point out quite strongly that the productive 
person does not like routine work, hence he 
probably does not look with favor on adminis- 
trative assignments. 

The producer is seen as being efficient in 
handling the work assigned to him, but the 
subordinates do not consider him as being 
less likely than the nonproducer to emphasize 
nonessentials or to get wrapped up in tech- 
nical details. Extreme rigidity in task be- 
havior does not seem to be a characteristic of 
the producers, as the subordinates point out 
that the tendency to “think of things as either 
black or white” is more characteristic of the 
nonproducers. 

The importance of communication in the 
behavior of the producer is indicated by the 
subordinates. Their comments suggest, how- 
ever, that they tend to evaluate oral com- 
munication more highly than written com- 
munication. 

The producer, as might easily be expected, 
is accorded a high degree of respect by the 
subordinates and is considered by them to be 
a person for whom they might like to work. 
The data suggest that this feeling of the sub- 
ordinates is much more akin to respect than 
it is to a general liking or feeling of warmth 
toward the producer. 

At this stage of inquiry, it is best to gen- 
erally regard these findings as indicative only 
of a stereotype of the productive engineer 
existing within our particular sample of sub- 
ordinates. Whether or not this stereotype is 
valid in the sense of accurately describing the 
behavior of engineers who are productive in 
terms of other, perhaps more objective cri- 
teria, remains to be seen. The stereotype 
probably is valid in the sense of its operating 
with the work situations to influence the be- 
havior of the subordinates and to influence 
the subordinate’s attempts to vary or alter his 
own behavior in order to reach a highly 
valued position. For example, from this 
analysis we might expect the subordinate to 
value skill in oral communication more highly 
than skill in areas of written communication. 
The subordinate may then attempt to rely 
on his verbal ability, particularly his ability 
to speak glibly in the Carnegie manner, and 




reject or even oppose efforts to increase his 
ability to express himself in more formal, 
written efforts. The extent to which this 
value is shared by his superiors may well de- 
termine his own success or failure as an engi- 
neer in terms of a criterion of approval by 
superiors. 


Summary 


Eighty student engineers enrolled in the 
cooperative training program of a large south- 
western university described the work behav- 
ior of productive and nonproductive engineers 
using a 250-item checklist developed in an 
earlier study. Approximately 60% of the 
items within the checklist discriminated sig- 
nificantly between the descriptions of the 
productive and nonproductive engineers at or 
beyond the .01 level of confidence. A de- 
scription of the stereotype of the productive 
engineer as seen by the subordinates was de-
veloped from an inspection of the responses
to the items. Tables indicating the 30 most,
and 30 least, discriminating items are given.

Received November 24, 1958.




References


Harrison, R., Hunt, W., & Jackson, T. A. Profile
of the mechanical engineer: I. Ability. Personnel
Psychol., 1955, 8, 219-234. (a)

Harrison, R., Hunt, W., & Jackson, T. A. Profile
of the mechanical engineer: II. Interests. Person-
nel Psychol., 1955, 8, 315-330. (b)

Harrison, R., Hunt, W., & Jackson, T. A. Profile
of the mechanical engineer: III. Personality. Per-
sonnel Psychol., 1955, 8, 469-490. (c)

Roe, Anne. A psychological study of physical sci-
entists. Genet. psychol. Monogr., 1951, 43, 121-
235. (a)

Roe, Anne. Psychological tests of research scientists.
J. consult. Psychol., 1951, 15, 492-495. (b)

Stoltz, R. E. Development of a criterion of research
productivity. J. appl. Psychol., 1958, 42, 308-310.

Stoltz, R. E. Factors in supervisors’ perceptions of
physical science research personnel. J. appl. Psy-
chol., 1959, 43, 256-258.

Van Zelst, R. H., & Kerr, W. A. Some correlates of
scientific and technical productivity. J. abnorm.
soc. Psychol., 1951, 46, 470-475.

Van Zelst, R. H., & Kerr, W. A. A further note on
some correlates of scientific and technical produc-
tivity. J. abnorm. soc. Psychol., 1952, 47, 129.

Van Zelst, R. H., & Kerr, W. A. Personality self-as-
sessment of scientific and technical personnel. J.
appl. Psychol., 1954, 38, 145-147.


Journal of Applied Psychology
Vol. 43, No. 5, 1959


FACTOR ANALYSIS OF REPORTED MINOR 
PERSONAL MISHAPS¹


J. D. KEEHN 


American University of Beirut 


While clinicians (Alexander, 1952; Dunbar,
1943; English & Pearson, 1945; Rowson,
1944) are generally of the opinion that cer-
tain individuals are more likely to have re-
peated accidents than others, some psycho-
logists like Webb (1956), Arbous and
Kerrich (1951), and a number of other re-
cent writers have argued that the concept of
“accident proneness” has not been convinc-
ingly demonstrated by accident statistics.
Thus on the one hand it is claimed that
“the accident prone individual is an impetu-
ous person [who] harbors a deeply ingrained
[...] the excessive regulations of
his upbringing . . . [and] has a strict con-
science which makes him feel guilty for his
[...]” (Alexander, 1952, p. 214) and on
the other that “large scale Air Force and Navy
studies give strong evidence that accidents
may not be predicted from preceding acci-
dent behavior” (Webb, 1956). That is, some
workers deny that accident likelihood can be
predicted from past accident records, while
others believe that a particular accident per-
sonality can be described precisely because
such predictions can be made.

Large scale studies can be cited in support
of either view and it is unlikely that extensive
investigations of industrial, aviation, or high-
way accidents will do much to clarify the
[...]. Although there are many reasons why
this should be so, the most likely one is that
large scale studies concern themselves only
with actual accidents which are serious enough
to be reported and which occur only in a nar-
row range of situations. Hence Webb (1956)
is careful to say that aircraft pilots cannot be
selected “on the basis of aircraft accident his-
tories” although he leaves open the question
of the predictive nature of other, minor mis-
haps. When it is noted that nonindustrial
home accidents were almost double industrial
accidents in 1951 (National Safety Council,
1952) the restrictive nature of specific indus-
trial accident data is apparent. Similarly,
near accidents where prompt action of one in-
dividual averts injury to another do not usu-
ally become incorporated into accident sta-
tistics.

¹ This study was made possible by a grant from
the Rockefeller Foundation to the Arts and Sciences
Division of the American University of Beirut. Ac-
knowledgment is made to the Rockefeller Foundation
and to Emma Oshagan, Rita Tabourian, and Ihsan
al-Issa, who assisted in the collection and analysis of
the data.

That “nonspecific” accidents might predict 
aircraft accidents due to “pilot error” has 
been shown by Kunkle (1946). Thus per- 
sonal injuries like sprains, cuts, fractures, and 
dislocations were significantly related to fly- 
ing accidents in the group of pilots that he 
studied. On the other hand, mishaps like 
falling down stairs, trapping fingers in doors, 
and driving accidents showed no such rela-
tionship. It is possible, then, that there are
certain classes of “personal” accidents and
that some of these classes may be predictive
of other accident classes in industrial, aircraft,
or driving situations. The present study sets
out to investigate the first of these possibili-
ties in a Near Eastern cultural setting.


Method 


A questionnaire containing 41 statements about
minor mishaps was administered to 100 male uni-
versity students between the ages of 18 and 25 years.
Most of the Ss were Arabs. The questionnaires were
filled out anonymously and individually by the Ss as
part of a larger study in which they were paid for
their services. University students rather than in-
dustrial workers were used as Ss in order to over-
come the possibility that some Ss might falsify their
responses through fear of jeopardizing their jobs.
Anonymous responses were requested as a further
check against possible falsification. The question-
naire and instructions are shown in Table 1.
Undecided and negative responses were combined
and tetrachoric correlations between the items com-
puted by means of the tables of Chesire, Saffir, and
Thurstone (1933). Items on which more than 80%
of the responses fell into one or other category were
excluded from further consideration owing to the
unreliability of tetrachoric r’s when computed from
extreme cuts. Items 29, 34, and 37 were eliminated
on this basis. The remaining table of intercorrela-
tions was factored by Thurstone’s centroid method
using highest column correlations as estimated com-
munalities.


Table 1


Accident Index


Will you please answer each question by putting a circle round “yes” or “no.” If you cannot make up your
mind, circle the “?.” Work quickly and do not worry too long about the exact meaning of each question. There
are no right or wrong answers, and no trick questions. Remember to answer every question as accurately as
you can.


 1. Do you often seem to cut yourself when you use sharp things?   Yes ? No
 2. Do you often bump into things and hurt yourself?   Yes ? No
 3. Have you ever eaten bad food or accidentally drunk a poisonous liquid?   Yes ? No
 4. Do you tend to make mistakes when you are writing?   Yes ? No
 5. Have you ever accidentally torn a book or newspaper or similar object?   Yes ? No
 6. Have you ever trapped your finger in a door?   Yes ? No
 7. Do people tend to bump into you on the street?   Yes ? No
 8. Do you find that by the time you made up your mind over something it is too late?   Yes ? No
 9. As a child did you always seem to be hurting yourself one way or another?   Yes ? No
10. Have you ever broken one of your bones?   Yes ? No
11. Do you tend to drop things and break them?   Yes ? No
12. Do you often burn yourself by touching hot places?   Yes ? No
13. Have you ever burned your mouth by eating or drinking something that was too hot?   Yes ? No
14. Did you ever swallow a harmful object as a child?   Yes ? No
15. Would you call yourself a careless person?   Yes ? No
16. Are you the kind of person who always seems to be knocking things over?   Yes ? No
17. Do you think you are an unlucky kind of person?   Yes ? No
18. Do you sometimes bite your tongue when talking or eating?   Yes ? No
19. Have you ever been almost hit by a car or other vehicle?   Yes ? No
20. Do you often seem to be twisting or spraining your ankles or wrists?   Yes ? No
21. Have you ever accidentally received an electric shock?   Yes ? No
22. Have you ever hit your finger accidentally with a hammer?   Yes ? No
23. Do you tend to spill things frequently?   Yes ? No
24. Do your belongings seem to wear out quicker than you expect?   Yes ? No
25. Do you sometimes misunderstand what people are saying to you?   Yes ? No
26. Do you often tend to lose or misplace things?   Yes ? No
27. As you walk do you sometimes trip over things?   Yes ? No
28. Do you find it difficult to write neatly without making mistakes or marks on the paper?   Yes ? No
29. Would you say that you are the kind of person who often has accidents?   Yes ? No
30. Have you ever scalded yourself by, for instance, putting your hand in a hot liquid or putting your foot into a hot bath?   Yes ? No
31. Do you frequently bruise yourself?   Yes ? No
32. Do you find yourself sometimes forgetting things that you know very well?   Yes ? No
33. Have you ever fallen down stairs?   Yes ? No
34. Do you find difficulty in remembering which is the hot tap in your bathroom?   Yes ? No
35. Have you ever mistaken the time after looking at your watch?   Yes ? No
36. Have you ever felt yourself in danger while swimming?   Yes ? No
37. Are you the kind of person who is frequently late for appointments?   Yes ? No
38. Do you have one or more scars on your body?   Yes ? No
39. Have you ever touched a hot stove or similar object by mistake?   Yes ? No
40. Do you tend to get ink on your fingers while you are writing?   Yes ? No
41. Do you ever find that people’s feelings are hurt by things you say?   Yes ? No
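The Method section describes two steps that are easy to illustrate: dropping items whose responses split more extremely than 80/20, and computing a tetrachoric correlation for each remaining pair of items. The sketch below is an illustration only. The study read its coefficients from the Chesire, Saffir, and Thurstone computing diagrams; the code substitutes the cosine-pi approximation, a different and rougher technique, and the cell counts are hypothetical.

```python
import math


def usable_split(n_yes, n_total, max_share=0.80):
    """True when neither response category holds more than max_share of answers."""
    share = n_yes / n_total
    return max(share, 1 - share) <= max_share


def tetrachoric_cos_pi(a, b, c, d):
    """Cosine-pi approximation to the tetrachoric r for a 2x2 table.

    a = yes/yes, b = yes/other, c = other/yes, d = other/other,
    where "other" pools the undecided and negative responses.
    """
    if b == 0 or c == 0:      # no discordant pairs in one direction
        return 1.0
    if a == 0 or d == 0:      # no concordant pairs in one direction
        return -1.0
    return math.cos(math.pi / (1 + math.sqrt((a * d) / (b * c))))


# Hypothetical counts for two items answered by 100 Ss.
r_example = tetrachoric_cos_pi(40, 10, 10, 40)
print(usable_split(85, 100), round(r_example, 3))
```

An item with 85 of 100 responses in one category fails the screen, which is the kind of item the study excluded before factoring.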


Results 


Most of the item intercorrelations were posi-
tive and 102 were significant at the .01 level
where about 7 would have been expected by


chance. Even allowing for the fact that the 
correlations are not all independent of each 
other this evidence testifies to the significance 
of the correlation matrix as a whole.² Three
factors were extracted and centroid and ro- 
tated loadings are shown in Table 2. The 
third factor though of doubtful significance 
was retained to facilitate rotation. Rotations 
were carried out graphically and blindly to- 
wards “orthogonal simple structure.” 
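The comparison of observed with chance-expected significant correlations can be reproduced with a short calculation. The sketch assumes that the significance level in question is .01, which is what the figure of "about 7" implies for the 38 items that entered the analysis; the variable names are illustrative.

```python
from math import comb

n_items = 38                       # items retained after the 80% screen
n_pairs = comb(n_items, 2)         # distinct item pairs in the matrix
alpha = 0.01                       # assumed significance level
expected_by_chance = n_pairs * alpha
observed = 102                     # significant correlations reported

print(n_pairs, round(expected_by_chance, 2), observed)
```

The count treats the correlations as independent, which the text itself notes they are not, so the comparison is only a rough guide.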

The first point of significance is that the
first unrotated centroid is a general factor
with no negative loadings. Of the 38 items
included in the analysis, 13 have loadings
greater than .50, a further 11 have loadings
of .40 and above, and only 2 items have load-
ings below .20. Apart from the possibility
that the factor reflects a general tendency to
agree with the items in the questionnaire it is
clear that this finding demonstrates that in-
dividuals who admit to having accidents re-
port having them in a wide variety of situa-
tions, some of which, like losing things (load-
ing .59), wearing out one’s clothes quickly
(loading .52) and hurting people’s feelings
(loading .46) would not normally be re-
garded as accident situations.

Interpretation of the rotated factors by
means of the items with the highest loadings
is by no means clear. Factor I’ is character-
ized by Items 12, 26, and 41, viz.:


12. Do you often burn yourself by touching hot
places?
26. Do you often tend to lose or misplace things?
41. Do you ever find that people’s feelings are hurt
by things you say?


The items with the highest loadings on
Factor II’ are:


35. Have you ever mistaken the time after looking
at your watch?
38. Do you have one or more scars on your body?
5. Have you ever accidentally torn a book or
newspaper or similar object?


Factor III’ is best demonstrated by the
following items:


27. As you walk do you sometimes trip over
things?
40. Do you tend to get ink on your fingers while
you are writing?
25. Do you sometimes misunderstand what people
are saying to you?


² The intercorrelation matrix has been deposited
with the American Documentation Institute. Order
Document No. 6025 from the ADI Auxiliary Publica-
tions Project, Photoduplication Service, Library of
Congress, Washington 25, D. C., remitting in advance
$1.25 for 35 mm. microfilm or [...] for [...]
in. photocopies. Make checks payable to
Chief, Photoduplication Service, Library of Congress.


If instead of attempting to interpret the
factors by reference to the most highly satu-
rated items a rough analysis of the content
of the [...] is made and then compared with
the factor pattern, a more consistent pattern
can be seen. Thus Items 1, 4, 5, 11, 21, 22, 23,
28, and 40 are all to do with manipulation of
the hands. Of these nine items, seven have
their highest loadings on Factor III’. Six
items, Nos. 6, 12, 16, 20, 30, and 39 involve
injury to an extremity where manipulation by
that extremity is not necessarily implied. Of
these items, four are most highly loaded on
Factor II’ and the others on Factor I’. Items
2, 7, 16, 31, 33, and 38 all pertain to gross
bodily injuries and four of them have their
highest loadings on Factor I’ and 2 on Factor
II’. It is possible, then, tentatively to label
the factors as depicting injuries due to ma-
nipulation of the extremities, involvement of
the extremities, and gross bodily involvement,
respectively. It is recognized, however, that
such an interpretation is highly speculative
and, even if correct, may not generalize be-
yond the sample used in the present study.


Table 2


Centroid and Rotated Factor Loadingsᵃ of Each of the
Items in the Accident Index
Entered in the Correlation Matrix


             Centroid              Rotated
Item     FI   FII  FIII      FI’  FII’ FIII’     h²
  1      37    30    2?      30   -15    40      27
  2      53    26    04      47    02    36      35
  3      24   -23   -10      07    33    07      12
  4      24    11    18      13   -04    29      10
  5      59   -12   -41      51    52    05      53
  6      49   -26    07      12    41    36      31
  7      34    19   -15      40    05    09      17
  8      40   -13    28      0?    19    47      26
  9      4?    15    31      21   -04    5?      31
 10      42    13   -40      55    24   -05      36
 11      4?    2?    21      27    05    47      30
 12      47    36   -27      65    02    08      43
 13      09   -54    22     -36    40    22      34
 14      54   -26    08      15    42    40      36
 15      47    11   -39      5?    ?7   -01      40
 16      5?    2?    ??      47    07    39      37
 17      20    27    20      18   -20    28      15
 18      39    18   -34      52    16   -02      30
 19      36   -25   -24      20    45    0?      24
 20      58    12   -12      50    21    27      37
 21      33   -05    35      00    06    48      23
 22      47   -35    20     -0?    42    45      38
 23      57    18   -23      58    20    18      41
 24      52    25    19      37   -03    47      36
 25      55    28    27      38   -08    56      46
 26      59    28   -22      6?    12    20      48
 27      54   -30    34      00    36    60      49
 28      22    20    27      12   -16    35      16
 30      45   -26   -16      21    48    16      30
 31      53    18    08      40    06    39      32
 32      15   -05   -29      2?    22   -14      12
 33      28   -27    07     -0?    32    23      16
 35      54   -45   -10      12    64    26      49
 36      26   -29   -16      05    42    04      18
 38      39   -37   -18      12    55    10      33
 39      53   -30    04      14    47    36      37
 40      46    32    37      30   -17    58      46
 41      46    39   -22      64   -02    12      42


ᵃ Decimal points omitted.
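The h² column of Table 2 can be checked against the loadings, since a communality is the sum of an item's squared loadings and an orthogonal rotation leaves that sum unchanged. The sketch below uses Item 5, whose entries are fully legible in the table; the function name is illustrative, and agreement within one unit is the most that two-place printed loadings can be expected to give.

```python
def communality_x100(loadings_x100):
    """Sum of squared loadings, in the table's units (decimal points omitted)."""
    return round(sum(v * v for v in loadings_x100) / 100)


# Item 5: centroid loadings and rotated loadings as printed in Table 2.
item5_centroid = (59, -12, -41)
item5_rotated = (51, 52, 5)

h2_centroid = communality_x100(item5_centroid)
h2_rotated = communality_x100(item5_rotated)
print(h2_centroid, h2_rotated)
```

Both computations give the printed value for Item 5. For some other items the two sets of loadings give values one unit apart, which is consistent with rounding and with graphical rotation.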


Summary and Conclusions 


A factor analysis was performed on the in-
tercorrelations between the responses of 100
university students, most of whom were Arabs,
to 38 statements about accidents and minor
mishaps. A general factor was found to run
through all the statements indicating that in-
dividuals who admit to having accidents in
[...] indicate that they have been
[...] accidents in other situations. Such
[...] not contradict the notion of
“accident proneness” and suggests the possi-
bility that some minor accidents and minor [...]
might be predictive of subsequent major
[...].

Rotation of the centroid axes to “orthogonal
simple structure” yielded three group factors
and an attempt was made to interpret these
factors in terms of the kinds of items having
their highest loadings on the respective fac-
tors. While the particular [...] of the
present study may be restricted to the cul-
ture in which the data were collected, it is
felt that the method has been sufficiently sug-
gestive to warrant further use in the study of
accidents.


Received November 25, 1958. 


REFERENCES


ALEXANDER, F. Psychosomatic medicine. London:
Allen and Unwin, 1952.

ARBOUS, A. G., & KERRICH, J. E. Accident statistics
and the concept of accident proneness. Biometrics,
1951, 7, 340-432.

CHESIRE, L., SAFFIR, M., & THURSTONE, L. L. Com-
puting diagrams for the tetrachoric correlation co-
efficient. Chicago: Univer. Chicago Press, 1933.

DUNBAR, F. Psychosomatic diagnosis. New York:
Harper, 1943.

ENGLISH, O. S., & PEARSON, G. H. J. Emotional
problems of living. New York: Norton, 1945.

KUNKLE, E. C. The psychological background of
“pilot error” in aircraft accidents. J. aviat. Med.,
1946, 17, 533-567.

NATIONAL SAFETY COUNCIL. Accident facts. Chi-
cago: Author, 1952.

ROWSON, A. J. Accident proneness. Psychosom.
Med., 1944, 6, 88-94.

WEBB, W. B. The prediction of aircraft accidents
from pilot-centered measures. J. aviat. Med.,
1956, 27, 141-147.


Journal of Applied Psychology
Vol. 43, No. 5, 1959


JOB SATISFACTION STUDY OF TWO SMALL 
UNORGANIZED PLANTS 


B. J. SPEROFF 


Lithographers and Printers National Association 


This study utilized the Tear Ballot for
Industry, General Opinions, a measure
which has been used and reported on fre-
quently over the last fifteen years, on a group
[...] from two small independently
owned and unorganized plants. The test it-
self consists of 10 items relating to job se-
curity, company welfare, supervisory ability,
working conditions, interpersonal relation-
ships, income, communications, confidence in
the “good intentions” and the “good sense” of the
management, and personal happiness. The
administration time for the test is extremely
short, it is completely anonymous, and the
worker merely tears his answers right on the
sheet.


Table 1

Pearsonian Coefficients of Correlation Between
Tenure Rate and Tear Ballot Items


Item                                                     r

 1. Does the company make you feel that your
    job is reasonably secure as long as you do
    good work?
 2. In your opinion, how does this company
    compare with others in its interest in the
    welfare of the employee?
 3. How does your immediate supervisor com-
    pare with other managers, foremen, or sec-
    tion leaders as to supervisory ability?
 4. Considering your work, are your working
    conditions comfortable and healthful?
 5. Are most of the workers around you the kind
    who still remember you when you pass them
    on the street?
 6. Do you think your income is adequate for
    your living needs?                                  .31
 7. Do you feel that you have proper oppor-
    tunity to present a problem, complaint, or
    suggestion to management?                           .84
 8. Do you have confidence in the good inten-
    tions of the management?
 9. Do you have confidence in the good sense of
    the management?
10. What effect is your experience with the com-
    pany having upon your personal happiness?


* Denotes significance at the 1% level of confidence.


Subjects and Procedure 


The personnel of two small independently owned 
unorganized plants—one manufacturing lawn and 
porch furniture (N = 22) and the other hand-woven 
machine belts (N = 14)—were administered coded 
tear ballots. The job tenure rate (total years on 
the labor market divided by the number of jobs held 
during the same period) for each worker was com-
puted and utilized “as an independent criterion
against which to correlate the individual job satis-
faction items of the Tear Ballot for Industry” (Kerr,
1948, p. 279). Table 1 summarizes these correlates.
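The criterion described above is a simple ratio, and each entry of Table 1 is a Pearsonian correlation between that ratio and the scores on one item. A minimal sketch follows; the worker records and item scores are hypothetical and the function names are illustrative, since the study's raw data are not reported.

```python
import math


def tenure_rate(years_on_labor_market, jobs_held):
    """Total years on the labor market divided by the number of jobs held."""
    return years_on_labor_market / jobs_held


def pearson_r(x, y):
    """Pearsonian product-moment coefficient of correlation."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical workers: (years on the labor market, jobs held, score on one item).
workers = [(12, 3, 4), (20, 2, 5), (6, 4, 2), (15, 5, 3), (9, 1, 5)]
rates = [tenure_rate(years, jobs) for years, jobs, _ in workers]
item_scores = [score for _, _, score in workers]
r_item = pearson_r(rates, item_scores)
print(round(r_item, 2))
```

A worker with few job changes over a long working life receives a high tenure rate, so a positive coefficient means that the more stable workers answered the item more favorably.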


Validity and Reliability 


The purpose of this study was to test the 
validity of the tear ballot on the premise that 
the higher the job satisfaction scores, the 
lower will be the job-related interpersonal 
communicative contacts between labor and 
management members. The number of job- 
problem sessions for a period of one year was 
thus utilized as the validation criterion based 
upon the hypothesis that the job-satisfied and 
happy worker has fewer job-related interview
sessions than does the job-dissatisfied or un-
happy worker; i.e., the frequency of such ses-
sions should be inversely related to job
satisfaction. Combining the data from both
plants, a Pearsonian coefficient of correlation
of -.76 was found between job satisfaction
scores and the number of job-related inter-
view sessions.

Reliability was established by retesting the
sample two weeks later. This coefficient of
correlation was .81, as compared to the ac-
cumulated mean of .83 reported by Kerr
(1951).
Received November 25, 1958. 


References


Kerr, W. A. On the validity and reliability of the
job satisfaction tear ballot. J. appl. Psychol.,
1948, 32, 275-281.

Kerr, W. A. Validation of a measure of job satis-
faction. Amer. Psychologist, 1951, 6, 360. (Ab-
stract)

Journal of Applied Psychology
Vol. 43, No. 5, 1959


NUMERICAL ERROR CHECKING 


E. T. KLEMMER 


IBM Research Center, Yorktown Heights, New York 


Numerical error checking has often been 
used as an item in clerical aptitude tests, but 
there has been little interest in the psycho- 
logical aspects of error checking itself. The 
present study addresses itself to two impor- 
tant questions about numerical error check- 
ing: (a) What is the effect of grouping digits 
on the speed and accuracy of error checking? 
(b) How does the probability of error affect 
the speed and accuracy of error checking? 
The present study was designed to provide
answers to these questions for the situation in
which S checks numbers on one page against
numbers on another page. This is fairly rep-
resentative of many actual error checking
situations, although some specialized tasks
may require a very different spatial separa-
tion of the numbers.

Another aspect of error checking behavior
which is of interest to the psychologist is the
rate of information processing by S. Error
checking requires so little overt output that
speed is, for all practical purposes, completely
limited by input and internal processing re-
strictions. In this study estimates are made
of the rate of handling information during
error checking.


Method 


The numbers to be checked were printed on pairs
of separate 8½-in. by 11-in. pages [...]
different. Every page contained 32 rows of numbers with
a space between each group of 4 rows. The number
of columns varied somewhat because of differences
in the horizontal grouping (which was one of the
experimental variables), but all pages were within
the range of 32-40 columns so that the total number
of digits per page varied only between 1024 and
1280. All pages were printed by offset from plates
prepared directly on the printer output of an IBM
704 computer. Figure 1 shows one actual page of
a typical pair. The other page of the pair differed
only in that some of the numbers were different.
Groupings 


Ten different horizontal groupings were employed
involving groups of one through ten digits. The
space between groups was enlarged as the number
of digits in each group was increased in order that
the over-all matrix of numbers would be about the
same width for all groupings.


Error Probability 


Three different error probabilities were used: 0.1, .01, and .001. Error probability is defined as the probability that any digit on one page will be different from the digit in the same position on the comparison sheet. For example, for error probability .01 approximately one digit in a hundred was changed on one sheet of each pair. It was changed to a digit chosen randomly from the nine remaining digits. Note that since the errors were determined probabilistically, the number of errors per page was not controlled exactly in any test. Each of the three error probabilities was used with each of the ten groupings, making a total of 30 different pairs of pages, hereafter called tests. Three alternate forms of the 30 tests were available and are designated A, B, and C.
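The construction of a test pair described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the IBM 704 program used in the study: the function names, the default of 36 columns (the source gives only a range of 32-40), and the seeded random generator are all hypothetical.

```python
import random


def make_test_pair(rows=32, cols=36, p_error=0.01, seed=0):
    """Build a standard page and a comparison page of random digits.

    Each digit of the comparison page differs from the standard page
    with probability p_error; a changed digit is drawn at random from
    the nine remaining digits. cols=36 and seed are assumptions.
    """
    rng = random.Random(seed)
    standard = [[rng.randrange(10) for _ in range(cols)] for _ in range(rows)]
    comparison = []
    n_errors = 0
    for row in standard:
        new_row = []
        for digit in row:
            if rng.random() < p_error:
                # Choose among the nine digits other than the original.
                digit = rng.choice([d for d in range(10) if d != digit])
                n_errors += 1
            new_row.append(digit)
        comparison.append(new_row)
    return standard, comparison, n_errors


def format_row(digits, group):
    """Print one row of digits in horizontal groups of the given size."""
    text = "".join(str(d) for d in digits)
    return " ".join(text[i:i + group] for i in range(0, len(text), group))


standard, comparison, n_errors = make_test_pair(p_error=0.1, seed=1)
print(format_row(standard[0], 6))
print(format_row(comparison[0], 6))
print("errors on this pair:", n_errors)
```

Because each digit is changed independently, the number of errors varies from pair to pair, which is the point made in the text about errors not being controlled exactly.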


Subjects 


Volunteer college students served as paid Ss in all tests.


Procedure

Two separate studies were run with separate groups of Ss. One group of 30 naive Ss took a series of 30 tests (pairs of pages) of Form A in a balanced design. That is, all Ss took all tests in order, but each S started with a different test. The order of tests was such that grouping changed between tests but error probability changed only [illegible] during the experiment. Another group of four Ss took a series of practice trials on tests of Form A and then took tests of Form B and Form C. The four Ss thus took two tests for each grouping and error probability, but because of time limitations the tests with grouping by 7 and 9 were omitted. Each of the four Ss took the 48 tests in a different order. The ordering was symmetrical with respect to error probability and grouping. Both studies were run in sessions approximately 40 min. long in which the Ss completed two to four pairs of test pages, depending upon grouping and error probability. All Ss were instructed to work as fast as possible and still check every number on the page. The time taken for each pair of pages was recorded to the nearest ¼ min. for each S.




Numerical Error Checking 


The Ss indicated the errors by drawing a single line through the discrepant digits on one page of each pair only. In the thirty-S experiment, each S was informed of the actual number of errors on each test immediately after finishing the test. This knowledge of results was not given in the study with only four Ss because the error keys were not available at the time.

[Figure 1: sample page of numbers printed in groups of six digits, with slash marks through some digits.]

Fig. 1. Typical comparison page of a test pair. The standard page of the pair had the same format, but some of the numbers were different. Page shown illustrates grouping by six with an error probability of 0.1. The slash marks were made by S to denote those digits which were different (errors).


Results

Speed

The results of both experiments are shown in Fig. 2 and 3, which plot the speed of checking (pairs of digits compared per second) against the number of digits in the horizontal groups. Note that, in all cases, the speed is lowest for groups of one digit, highest for groups of three or four digits, and then falls off again for larger groups. The average performance of both sets of Ss is such that speed of checking with groups of one is only 56% of the speed with groups of three, and speed of checking with groups of ten is 67% of the speed with groups of three.


Accuracy 

The accuracy of checking did not change in any regular way with the size of horizontal grouping, but differences for the three error probabilities were noted. A greater percentage of errors is detected by the Ss when many errors are present, as shown in Fig. 2 and 3. Correct detection is 96% for error probability .1, 87% for error probability .01, and 76% for error probability .001 in the thirty-S experiment. The four practiced Ss showed a range from 98% to 83% for the same tests. Note that the nu[mber ...] detected is direct[ly ...] consistent change with grouping. For the thirty-S experiment the number of false marks averaged 1% or less of the total marks made in tests with each of the error probabilities. The four practiced Ss made false marks numbering only 0.4% of the total marks in the .1 error-probability test and no false marks in the other tests.

[Figure 2: results for the 30 Ss. Ordinate: pairs of digits compared per second. Abscissa: digits in each horizontal group (1 to 10). Legend, error probability and percent of errors not detected: .001, 24; .01, 13; .1, 4.]

Fig. 2. Speed of numerical error checking as a function of horizontal grouping and error probability. Each point represents the average of 30 Ss and is based on time scores for checking a pair of pages with about 1000 digits on each page.

Informational Measures 


In the present study the following information
processing operations are involved. First
the S must locate, perceive, and store tempo- 
rarily one or more digits from the standard 


E. T. Klemmer 


page of each pair; locate and perceive the 
corresponding digits from the comparison 
page; then compare the two sets of digits to 
detect differences. Since there appears to be
no simple way of estimating the information
involved in locating the digits, spatial information
will be neglected in the following
analysis.

After locating any particular digit on the standard page, the perception and temporary storage of that digit represent assimilation of one out of ten equally likely alternative digits. In terms of the Shannon-Wiener measure of information, this represents log2 10 or 3.32 bits of information. Since the comparison digits are highly predictable from the standard digits, the information content of the comparison digits is low. The average uncertainty of a comparison digit is .02 bits, .11 bits, and .79 bits for error probabilities .001, .01, and .1, respectively, as calculated from the probability distribution over all possible comparison digits with knowledge of the corresponding standard-page digit. The total informational input for each pair of digits (neglecting position information) is thus the sum of the uncertainties of each digit of the pair or 3.34, 3.43, and 4.11 bits for error probabilities .001, .01, and .1, respectively.
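The calculation described in this paragraph can be sketched as follows. The sketch assumes, as the Method section states, that a changed digit is equally likely to be any of the nine remaining digits; the function name is hypothetical and the code is an illustration rather than the author's own computation.

```python
import math


def comparison_digit_uncertainty(p_error, n_digits=10):
    """Uncertainty in bits of a comparison digit, given the standard digit.

    With probability 1 - p_error the digit is unchanged; otherwise it is
    one of the (n_digits - 1) other digits, each equally likely.
    """
    p_same = 1.0 - p_error
    p_each_other = p_error / (n_digits - 1)
    bits = -p_same * math.log2(p_same)
    bits -= (n_digits - 1) * p_each_other * math.log2(p_each_other)
    return bits


standard_bits = math.log2(10)  # one of ten equally likely digits
for p in (0.001, 0.01, 0.1):
    h = comparison_digit_uncertainty(p)
    print(f"p = {p}: comparison digit {h:.3f} bits, pair {standard_bits + h:.2f} bits")
```

Under these assumptions the sketch gives about .11 and .79 bits for error probabilities .01 and .1, in agreement with the text. For .001 it gives about .015 bits, slightly below the .02 reported; the source does not state how that value was rounded.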

The total information input as described above is not actually processed by the S, that is, he does not perform perfectly. The S may make mistakes at any stage of locating, perceiving, comparing, and marking the digits. We cannot accurately determine from the final results how many mistakes were made in each operation, but fortunately there is good evidence that the great bulk of mistakes are of a single class. Consider that mistakes of perceiving, comparing, or marking would almost certainly lead to more false positive mistakes in marking the test page than mistakes of undetected errors since 90% to 99.9% of the digits are subject to false positive mistakes and only .1% to 10% of the digits are subject to mistakes of omission. The actual data from the Ss show that false positives make up only a small fraction of the total mistakes. This strongly suggests that most of the mistakes made by the Ss are caused by failures to compare some pairs of digits at all. That is, most failures in the performance are due to skipping blocks of digits on both the standard and comparison pages. Note that if S skipped a different number of digits on the two pages it would lead to many apparent errors close together, which condition the S knows is highly improbable. Therefore, this type of skipping is largely self-correcting.

If the undetected errors are considered due to the S's failure to compare the digits involved, then it may fairly be assumed that the S also fails to compare many of the digits which are actually the same. His over-all speed of performance may therefore be reasonably reduced by the proportion of undetected errors. Clearly this correction should be applied separately to each S and test but for the .001 error-probability test (where the correction is most important) the number of errors is so small (zero to four per page) that no reasonable estimate of percentage of undetected errors is possible for each S separately. The correction for undetected errors is therefore made simply by multiplying the average speed of checking by the proportion of errors correctly detected. Since the percentage of correctly detected errors shows no regular variation with grouping, the percentages are also averaged over groupings.


[Figure 3: results for the 4 practiced Ss. Ordinate: pairs of digits compared per second. Abscissa: digits in each horizontal group.]

Fig. 3. Speed of numerical error checking as a function of horizontal grouping and error probability. Each point represents the average of 4 Ss and is based on time scores for checking two pairs of pages with about 1000 digits on each page.


[Figure 4: curves for error probabilities .001, .01, and .1. Abscissa: digits in each horizontal group (1 to 10).]

Fig. 4. Rate of information handling in numerical error checking as a function of horizontal grouping and error probability. See text for a description of the method used in deriving informational rates from the speed scores of Fig. 2 and 3.


The speed of checking, as shown on the 
ordinates of Fig. 2 and 3, was corrected for 
mistakes as described above and then multi- 
plied by the total information inputs 3.34, 
3.43, and 4.11 bits as derived above. Fig- 
ure 4 plots the informational rates against 
grouping for both sets of Ss and all three 
error probabilities. Figure 4 shows that the 
Ss were working at approximately the same 
informational rate even though the error 
probability varied over a 100:1 range. The
difference between the four practiced Ss and 
the 30 unpracticed Ss is maintained. The 
shape of the plot of information rate against 
grouping is the same as the shape of the un- 
corrected speed curves of Fig. 2 and 3, since 
the correction to the information measure is 
not a function of grouping. 
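The derivation of an informational rate from a speed score can be sketched as below. The bits per digit pair and the detection proportions for the thirty-S experiment are taken from the text (the .001 proportion is 100% minus the 24% not detected shown in the Fig. 2 legend); the speed of 3.0 pairs per second is a hypothetical value chosen only to show the arithmetic, since the plotted speeds are not tabulated in the source.

```python
# Bits per digit pair for each error probability, as derived in the text.
BITS_PER_PAIR = {0.001: 3.34, 0.01: 3.43, 0.1: 4.11}

# Proportion of errors correctly detected, thirty-S experiment.
# 0.76 is taken as 1 minus the 24% "not detected" in the Fig. 2 legend.
PROPORTION_DETECTED = {0.001: 0.76, 0.01: 0.87, 0.1: 0.96}


def information_rate(pairs_per_second, p_error):
    """Bits per second: speed, corrected for undetected errors, times bits per pair."""
    corrected_speed = pairs_per_second * PROPORTION_DETECTED[p_error]
    return corrected_speed * BITS_PER_PAIR[p_error]


# Hypothetical speed score, for illustration only.
example_speed = 3.0
for p in (0.001, 0.01, 0.1):
    print(f"p = {p}: {information_rate(example_speed, p):.2f} bits per second")
```

Because both multipliers depend only on error probability, the corrected curve for each error probability is the speed curve scaled by a constant, which is why the shape against grouping is unchanged.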


Summary and Conclusions 


Speed and accuracy of numerical error 
checking were studied as a function of the 
probability of randomly placed errors and 
horizontal grouping of the digits. Ten group- 
ings (1 through 10 digits) and three error 
probabilities (0.1, .01, and .001) formed the 
basis of tests given to four practiced Ss and 
30 naive Ss. 

The speed of error checking was highest 
for groupings of three or four digits and fell 
off for smaller or larger groups. Compared 
with grouping by three, groups of one digit 




were checked an average of 44% slower and 
groups of ten were checked an average of 
33% slower. 

Speed of checking was inversely related to 
error probability so that the .001 error-prob- 
ability tests were checked most rapidly and 
the 0.1 error-probability tests, most slowly. 
This increased speed on the low error-prob- 
ability tests was accompanied by a higher 
percentage of undetected errors so that the 




Ss were handling information at about the 
same rate with all three error probabilities. 

The accuracy with which Ss checked errors 
showed no identifiable variation with hori- 
zontal grouping, even though speed of check- 
ing varied greatly with size of horizontal 
group. The great majority of the S’s mis- 
takes were failures to detect errors actually 
present. 


Received December 1, 1958. 


Journal of Applied Psychology 
Vol. 43, No. 5, 1959


COGNITIVE SIMILARITY AND INTERPERSONAL
COMMUNICATION IN INDUSTRY ¹


HARRY C. TRIANDIS ²


Cornell University 


The present paper reports a test of the hypothesis that cognitive similarity affects the process of interpersonal communication. It presents methods for the measurement of cognitive similarity and shows that the measures obtained are related to perceived effectiveness of communication and liking between two people. Since permanent, long-standing relationships were necessary for purposes of the study, supervisors and subordinates in industry were used as Ss. Other pairs, such as child-parent, therapist-patient, or student-teacher, could have been used, though each presents special difficulties. A laboratory replication of the study has been reported elsewhere (Triandis, 1959a).

Two kinds of cognitive similarity are considered. The first, categoric similarity, is obtained by comparing the categorizations of two Ss, through an adaptation of Kelly's (1955) Role Repertory Test. The second, syndetic similarity, is obtained by comparing the ways concepts are associated with other concepts, and uses Osgood's (1952) semantic differential.

Recent studies of perception (Hayek, 1952) and thinking (Bruner, Goodnow, & Austin, 1956) have emphasized the importance of categorization. If categorization is central to these processes it should also be important in interpersonal communication. That is, if two people categorize events, objects and concepts in similar ways they should be able to communicate more effectively.

The work of Osgood and his associates (Osgood, Suci, & Tannenbaum, 1957) stresses the importance of the "semantic space" in phenomena related to attitudes and communication. It seems a reasonable hypothesis that if two people have similar "semantic spaces" they should be able to communicate more effectively.

¹ This paper is based on portions of the writer's doctoral dissertation. The author gratefully acknowledges the guidance and help of W. W. Lambert, T. A. Ryan, and W. F. Whyte. The [...], of which this is a part, was supported by a grant from the Foundation for Research on Human Behavior.

² Now at the University of Illinois.

Cognitive similarity is related to additional 
variables. Newcomb (1953, 1956, 1958) sug- 
gests the following model: If A and B are 
cognitively similar and there is an opportunity 
for communication (propinquity), the com- 
munication will be more effective, the rela- 
tionship between A and B will be more re- 
warding, and A and B will therefore like each 
other more than if A and B are not cogni- 
tively similar. Cognitive similarity implies a 
similar orientation towards X, in Newcomb’s 
A-B-X model. Increased liking leads to higher 
rates of interaction between A and B and this, 
in turn, permits greater cognitive similarity 
thus starting the cycle all over again. 

This paper relates categoric similarity and 
syndetic similarity to perceived communica- 
tion effectiveness and liking of the supervisor 
by the subordinate. The hypotheses that are 
tested may be stated as follows: (a) The 
higher the communication effectiveness be- 
tween supervisor and subordinate, the more 
the liking of the subordinate for the super- 
visor. (b) The higher the categoric simi- 
larity between the supervisor and subordinate 
the greater the communication effectiveness 
and the more the liking of the subordinate 
for the supervisor. (c) The higher the syn- 
detic similarity between the subordinate and 
the supervisor, the greater the communication 
effectiveness and the liking of one for the other.


Method 


The study was conducted in an industry employing 
300 people. Approximately one half of the em- 
ployees participated in the study. Details on the 
company and the Ss can be found in Triandis (1958
or 1959b).

Procedure. (a) Categoric similarity: Twelve triads of jobs and 12 triads of people were presented to the Ss (see Triandis 1958 or 1959b for exact jobs and people). The Ss were asked: "Which one of these three jobs (people) is more different from the other two?" "Why?" and "What is the logical opposite of the characteristic that makes it different?" Thus, we obtained lists of characteristics of jobs and people and their logical opposites. These lists were then subjected to a content analysis and rated as to their similarity by the two judges. The corrected inter-rater reliability was .92 for people and .87 for jobs. The instructions to the raters and the rating scale can be found in Triandis (1958 or 1959a). (b) Syndetic similarity: A semantic differential was constructed for jobs and another one for people. Most of the scales of these differentials were relevant to the concepts that were to be rated on them. Twenty-eight of the scales of the differentials were obtained from a stratified random sample of the lists of characteristics obtained from the categoric similarity procedure described under (a) above. Ten additional scales were selected so as to represent the seven factors of Osgood et al. (1957, pp. 62-64). Eleven concepts were judged against these scales. They were: a welder's job, a teacher's job, a personnel director's job, a vice president's job, and a clerk's job. The sequence of these jobs was counterbalanced, for every group of Ss. The people-concepts used were: Dick T. (the personnel director of the company), your supervisor, the boss of your supervisor, the vice president of your division, a fellow at work whom you like, and an effective manager you have known well and who is not the same as any of the men already rated. The instructions as well as the exact semantic differentials used may be found in Triandis (1958, pp. 296-298). The test-retest reliability of the differentials for 20 workers was .83 and .92. The syndetic similarity was computed from


Table 1

Intercorrelations between the Main Variables

        Kp            Oj            Op       C        L
Kj      [row illegible in source]
Kp                    [row illegible in source]
Oj                                  .50**    .45**    .49**
Op                                           .34      .41*
C                                                     [illegible]

* p < .10.
** p < .05.



\[ 1 - \frac{\sum d^{2}}{36\,n} \]

where n is the number of scales over which the difference d between the ratings of the two Ss is being summed. The constant 36 = 6² is due to the use of a seven-point scale. Five jobs and three people were used in the computation of the syndetic similarity coefficients. (c) Communication effectiveness and liking scales. Two scales were constructed, one for each variable. The Thurstone method of successive intervals (Edwards, 1957) was used. The items and scale values can be found in Triandis (1958, pp. [illegible]-112). The parallel form reliability of these scales, using 45 college student Ss, was [illegible]. The scales were subjected to a scalogram analysis and yielded Guttman coefficients of reproducibility of [illegible] and .88, respectively. The two scales were highly intercorrelated. For 31 female clerks r = .76; for 31 male clerks r = .84; for 42 managers r = .83 and for 51 workers r = .92.
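A coefficient of this kind can be sketched in Python. The printed formula is badly garbled in the scanned source, so the form used here, one minus the sum of squared rating differences divided by 36n, is an assumption inferred from the definitions of n, d, and the constant 36 = 6² given in the text. The function name and the sample ratings are hypothetical.

```python
def syndetic_similarity(ratings_a, ratings_b, scale_points=7):
    """Similarity of two Ss' semantic differential ratings.

    Assumed form: 1 - sum(d**2) / (max_d**2 * n), where d is the
    difference between the two ratings on one scale, n is the number
    of scales summed over, and max_d = scale_points - 1 (6 for a
    seven-point scale, giving the constant 36).
    """
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("both Ss must rate the same nonempty set of scales")
    n = len(ratings_a)
    max_squared_difference = (scale_points - 1) ** 2
    sum_squared = sum((a - b) ** 2 for a, b in zip(ratings_a, ratings_b))
    return 1.0 - sum_squared / (max_squared_difference * n)


# Hypothetical ratings of one concept by a supervisor and a subordinate.
supervisor = [4, 5, 6, 2, 7]
subordinate = [5, 5, 4, 2, 6]
print(round(syndetic_similarity(supervisor, subordinate), 3))
```

Under this assumed form the coefficient is 1 when the two profiles are identical and 0 when every rating differs by the full width of the scale.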


Results 


Correlational analysis. Since we considered independent pairs of supervisors and subordinates we could only use 20 such pairs for this analysis. Table 1 shows the matrix of intercorrelations.

Factor analysis of the matrix of intercorrelations. The matrix of intercorrelations which consists of the correlation coefficients of variables Kj, Kp, Oj, and Op as well as C and L was factored by means of Thurstone's (1947) centroid method. Three factors were extracted and rotated for simple structure. The unrotated and rotated matrices are shown in Tables 2 and 3.


Table 2

Unrotated Factor Matrix, of Matrix of Intercorrelations between Our Main Variables

Variable                                a1      a2      a3      a1²     a2²     a3²
Categoric similarity, jobs (Kj)         .419   -.510    .222    .176    .260    .049
Categoric similarity, people (Kp)       .566   -.439   -.145    .320    .192    .021
Syndetic similarity, jobs (Oj)          .628    .379   -.203    .394    .143    .041
Syndetic similarity, people (Op)        .625    .143    .232    .391    .021    .054
Communication effectiveness (C)         .647    .211    .212    .419    .045    .045
Liking for supervisor (L)               .667    .335    .248    .445    .112    .062


Table 3

Rotated Factor Matrix of Matrix of Intercorrelations between Our Main Variables

Variable    a1      a2      a3      a1²     a2²     a3²     h²
Kj          .670    .204   -.006    .458    .041    .000    .491
Kp          .690    .186    .143    .485    .035    .013    .533
Oj          .096    .405    .605    .009    .164    .364    .537
Op          .276    .394    .456    .076    .154    .210    .400
C           .237    .001    .654    .056    .000    .430    .486
L           .143   -.003    .747    .020    .000    .560    .580

Percentage of total variance        31.3    13.7    55.0    100.0


The first factor, which accounts for 31.3% of the variance accounted for, is a categoric similarity factor. The second factor accounts for 13.7% of the variance and may be called a syndetic similarity factor. The third factor accounts for 55% of the variance and is saturated with L, C, Oj, and Op. It may be called an evaluative factor.

The regression equations. The means and standard deviations of the six main variables are presented in Table 4.

Using the standard methods for the determination of regression equations (McNemar, 1949, Chap. 9), including Doolittle's method, we obtained the following equations, expressed in standard form.

\[ z'_{C} = .0001\,z_{K_j} + .289\,z_{K_p} \qquad [1] \]

\[ z'_{C} = .373\,z_{O_j} + .153\,z_{O_p} \qquad [2] \]

\[ z'_{L} = .168\,z_{K_j} + .428\,z_{K_p} \qquad [3] \]

\[ z'_{L} = .380\,z_{O_j} + .221\,z_{O_p} \qquad [4] \]

\[ C = -1.6\,K_j + 9.4\,K_p + 38.5\,O_j + 5.2\,O_p - 33.6 \qquad [5] \]

with an error of 2.2.

\[ L = 1.9\,K_j + 3.2\,K_p + 9.3\,O_j + 1.9\,O_p - 2.48 \qquad [6] \]

with an error of .45.

If we multiply both sides of Equation [6] by 4.15, to equalize the coefficient of Oj, we can compare [5] and [7] more conveniently.

\[ 4.15\,L = 7.9\,K_j + 13.2\,K_p + 38.5\,O_j + 7.9\,O_p - 10.2 \qquad [7] \]

Thus, the communication effectiveness and liking for supervisor scores can be predicted from the knowledge of the categoric similarity and syndetic similarity coefficients. The multiple r for the liking for supervisor scores is .61 (p < .003), and the one for the communication effectiveness scores is .51 (p < .02). The most effective predictor of either communication effectiveness or liking for supervisor is the syndetic similarity for jobs. The second most effective predictor is the categoric similarity about people. The other two cognitive similarity coefficients are ineffective.
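The two-predictor equations in standard form can be checked against the intercorrelations with the usual formula for standardized regression weights. The sketch below is an illustration, not the Doolittle computation used in the paper. It uses the Table 1 values that are legible in the scanned source: r = .50 between Oj and Op, .45 and .34 between C and Oj, Op, and .49 and .41 between L and Oj, Op. The function name is hypothetical.

```python
def two_predictor_betas(r_y1, r_y2, r_12):
    """Standardized regression weights for a criterion on two predictors.

    r_y1, r_y2: correlations of the criterion with predictors 1 and 2.
    r_12: correlation between the two predictors.
    """
    denominator = 1.0 - r_12 ** 2
    beta_1 = (r_y1 - r_y2 * r_12) / denominator
    beta_2 = (r_y2 - r_y1 * r_12) / denominator
    return beta_1, beta_2


R_OJ_OP = 0.50  # syndetic similarity, jobs with people

# Communication effectiveness on Oj and Op (compare Equation [2]).
print([round(b, 3) for b in two_predictor_betas(0.45, 0.34, R_OJ_OP)])

# Liking for supervisor on Oj and Op (compare Equation [4]).
print([round(b, 3) for b in two_predictor_betas(0.49, 0.41, R_OJ_OP)])
```

With these rounded correlations the sketch reproduces the weights of Equation [2] and comes within .001 of those of Equation [4]; the small difference is what rounding of the tabled correlations would be expected to produce.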

The analysis of variance. The correlation 
procedures that gave the results reported 
above have one great deficiency; they waste 
data. Each correlation is based on only a 
few supervisor-subordinate pairs because of 
the requirements for independence. Since re- 
sults based on small samples are less convinc- 
ing, and significant relationships are not easily 
obtained with such samples, it is desirable to 
use other statistical procedures. Analysis of 
variance is the appropriate technique. If each 
supervisor is considered a different “treat- 
ment,” then it is possible to use many more 
Ss in our computations. We have, then, two 
classifications of the data; one according to 


Table 4

Means and Standard Deviations of Variables

Variable                                M       SD
Categoric similarity, jobs (Kj)         .128    .078
Categoric similarity, people (Kp)       .097    .064
Syndetic similarity, jobs (Oj)          .920    .024
Syndetic similarity, people (Op)        .926    .049
Communication effectiveness (C)         7.38    2.55
Liking for supervisor (L)               5.77    .58


cognitive similarity, the other according to supervisor. Since there is a variable number of subordinates reporting to each supervisor, however, we have unequal subclass n's. Also, in some cases we have had missing cells (all subordinates of a given supervisor were either very similar, or very dissimilar). We avoided a large number of missing cells by excluding from our analyses supervisors who had only one subordinate for whom we had complete data. Even with this restriction, however, we had a number of missing cells; in other words, we did not have the standard type of analysis of variance. Analyses of variance with missing cells and unequal n's are described in Snedecor (1956, pp. 382-385). About twenty such analyses were undertaken. The most interesting will be discussed below.

A triple classification analysis of variance is particularly suitable for our data (effect of categoric similarity, syndetic similarity, and supervisor). Such analyses were not available for unequal n's and missing subclasses when the analyses were first undertaken. Professor C. R. Henderson, of Cornell's Department of Animal Husbandry, a mathematical geneticist, solved the problem after a request from this writer.


Table 5

The Results of the Analyses of Variance: Summary

Classification     Dependent    Level of           Percentage of     N used in
                   variable     significance       total variance    analysis

Double Classification Analyses
S + Kp             C            p < .02            6.0               [illegible]
S + Kp             L            p < .01            6.6               [illegible]
S + Kj             C            N.S.               0                 [illegible]
S + Kj             L            N.S.               0                 [illegible]
S + K(j+p)         C            p < .10            5.7               [illegible]
S + K(j+p)         L            N.S.               2.3               [illegible]
S + Oj             C            p < .001           6.6               [illegible]
S + Oj             L            p < .01            4.9               [illegible]
S + Op             C            N.S.               2.3               [illegible]
S + Op             L            N.S.               1.5               [illegible]
S + O(j+p)         C            p < .025           5.9               [illegible]
S + O(j+p)         L            p < .03            4.9               [illegible]

Triple Classification Analyses
S + O + K          C            p < .125 (for O)   2.6               53
[subscripts                     p < .200 (for K)   2.2
illegible]
[illegible]        C            p < .30            0                 [illegible]
S + O + K          [illegible]  p < .001 (for O)   7.1               53
[subscripts                     N.S. (for K)       .3
illegible]

Note. S = supervisor; K = categoric similarity; O = syndetic similarity; p = people; j = jobs; j + p = average of j + p scores; C = communication effectiveness; L = liking for supervisor; and N.S. = nonsignificant.


Table 5 presents a summary of all the analyses of variance. The results of these analyses are as follows:


1. Categoric similarity based on people is significantly related to both communication effectiveness and liking for supervisor. It takes care of 6.0 and 6.6% respectively of the variance of scores.

2. Categoric similarity based on jobs is not significantly related to either communication effectiveness or liking for supervisor.

3. If we average the categoric similarity scores we can predict communication effectiveness, accounting for 5.7% of the variance, but not liking for supervisor.

4. Syndetic similarity about jobs is highly related to both communication effectiveness and liking and accounts for 6.6 and 4.9% of the variance.

5. The results of the triple classification indicate that syndetic similarity is a much more important variable than categoric similarity.


In addition to these analyses, in the case of syndetic similarity on jobs (Oj) we have enough cases to make separate analyses for workers, clerks, and managers. Table 6 presents the analysis of variance results for the management group; Table 7 for the clerks, Table 8 for the workers, and Table 9 for all groups combined.


Table 6

Analysis of Variance of Communication Effectiveness Scores Classified According to Supervisor and Levels of Syndetic Similarity About Jobs (Management Group Only)

Source                    SS         df     Variance    F         Percentage of variance
Supervisor (S)            646.45     17     48.0        4.45**    60
Oj                        91.44      1      91.0        8.40*     9
S × Oj                    70.35      11     6.4                   7
Individual differences    --         --     10.8                  24
Total                     1070.00    76     156.2                 100

* p < .01.
** p < .001.


Table 7

Analysis of Variance of Communication Effectiveness Scores Classified According to Supervisor and Levels of Syndetic Similarity About Jobs (Clerks Only)

Source                    SS              df            Variance      F    Percentage of variance
S                         [illegible]     [illegible]   [illegible]        [illegible]
Oj                        [illegible]     [illegible]   [illegible]        [illegible]
S × Oj                    [illegible]     [illegible]   8.38               18
Individual differences    --              [illegible]   19.98              37
Total                     192.59          21            65.28              100


Table 8

Analysis of Variance of Communication Effectiveness Scores Classified According to Supervisor and Levels of Syndetic Similarity About Jobs (Workers)

Source                    SS        df    Variance    F       Percentage of variance
S                         45.72     3     15.24       2.57    14
Oj                        10.91     1     10.91       1.84    3
S × Oj                    25.26     3     8.42                8
Individual differences    --        41    5.92                75
Total                     [row illegible in source]


Table 9

Analysis of Variance of Communication Effectiveness Scores Classified According to Supervisor and Levels of Syndetic Similarity About Jobs (All Groups)

Source                    SS         df     Variance    F          Percentage of variance
S                         868.69     17     51.44       7.55**     52
Oj                        110.65     1      110.65      16.24**    7
S × Oj                    150.44     7      21.49       3.15*      9
Individual differences    --         112    6.81                   32
Total                     [row illegible in source]

* p < .01.
** p < .001.


Discussion

Examination of Tables 6, 7, 8, and 9 shows that both differences in level of syndetic similarity about jobs and differences in supervisor determine portions of the variance of communication effectiveness scores. This phenomenon is most clear with the management group, less clear with the clerks and least clear with the workers. One is tempted to generalize that the extent to which the job held by the Ss is "intellectual" determines the influence of syndetic similarity about jobs on the communication scores. It may be that when a S has a manual job, his perception of that job and other jobs is not very important in terms of communication with his supervisor. Very often the worker takes a job that pays X dollars per hour and is not very concerned with the nature of the job. The supervisor tells him what the job is and he does it. With professional jobs, however, such as with engineers or managers, differences in the perception of jobs between supervisor and subordinate appear to be crucial.


Explanation of the relative effectiveness of the four indices of cognitive similarity. Syndetic similarity based on jobs and categoric similarity based on people were the only indices that were related to communication effectiveness and liking. This requires an explanation.

C. E. Osgood, in a private communication, 
suggested that the difference in the effective- 
ness of the two syndetic similarity indices is 
due to differences in the representativeness of 
the concepts rated. He argued that the jobs 
used in the computation of the syndetic simi- 
larity coefficients for jobs were more diverse 
and representative. They were welder, teacher, 
vice president, clerk, and personnel director. 
The people used in the syndetic similarity 
coefficients for people, on the other hand, 
were more homogeneous. They were the per- 
sonnel director, a supervisor, and the vice 
president of the employee’s division—all “su- 
pervisory.” This seems a reasonable explana- 
tion. It suggests further research to establish 
whether in fact one would get an even higher 
correlation between liking and syndetic simi- 
larity when extremely diverse concepts are 
rated by two Ss. One might conceivably ex- 
tend this explanation to explain also the 
greater effectiveness of categoric similarity 
based on people as compared to the cate- 
goric similarity based on jobs. 

There is, then, some evidence that certain kinds of cognitive similarity are related to communication effectiveness and liking between two Ss. Whether this is a specific or a general phenomenon is subject for further research. A laboratory test of the hypothesis tested in the present paper (Triandis, 1959a) suggests that it is a sufficiently stable phenomenon to deserve further study.


Summary 


One hundred and fifty-five Ss responded to
12 triads of jobs and 12 triads of people. The
Ss were asked to state "Which job (person)
is more different from the other two?" and
"Why?" The responses of subordinates and
supervisors to these triads were compared by
two judges. If the responses were judged to
be similar the index of categoric similarity of
the pair was high. The same Ss were asked
to rate five jobs and six people on specially
constructed semantic differentials. Similarity
of the "semantic profiles" obtained indicated
high syndetic similarity between a boss and
a subordinate. Successive interval scales of
perceived communication effectiveness and
liking within the boss-subordinate pair were
also constructed. Correlational analyses and
analyses of variance showed an association
between categoric similarity based on people
and syndetic similarity based on jobs and
communication effectiveness and liking within
the pair. This is considered evidence supporting
the hypothesis that cognitive similarity
is a significant variable in interpersonal
communication and liking.


Received January 22, 1959. 


REFERENCES 


Bruner, J. S., Goodnow, J. J., & Austin, G. A. A
study of thinking. New York: Wiley, 1956.

Edwards, A. L. Techniques of attitude scale construction.
New York: Appleton-Century-Crofts, 1957.

Hayek, F. A. The sensory order: An inquiry into
the foundations of theoretical psychology. Chicago:
Univer. of Chicago Press, 1952.

Jenkins, W. L. A quick graphic method for the
product moment "r." Educ. psychol. Measmt.,
1945, 5, 437-443.

McNemar, Q. Psychological statistics. New York:
Wiley, 1949.

Newcomb, T. M. An approach to the study of
communicative acts. Psychol. Rev., 1953, 60, 393-404.

Newcomb, T. M. The prediction of interpersonal
attraction. Amer. Psychologist, 1956, 11, 575-586.

Newcomb, T. M. The cognition of persons as cognizers.
In R. Tagiuri & L. Petrullo (Eds.), Person
perception and interpersonal behavior. Stanford:
Stanford Univer. Press, 1958.

Osgood, C. E. The nature and measurement of
meaning. Psychol. Bull., 1952, 49, 197-237.

Osgood, C. E., Suci, G. J., & Tannenbaum, P. H.
The measurement of meaning. Urbana: Univer.
of Illinois Press, 1957.

Snedecor, G. W. Statistical methods. Ames, Iowa:
Iowa State Coll. Press, 1956.

Thurstone, L. L. Multiple factor analysis. Chicago:
Univer. of Chicago Press, 1947.

Triandis, H. C. Some cognitive factors affecting
communication. Unpublished doctoral dissertation,
Cornell Univer., 1958.

Triandis, H. C. Categoric similarity and the communication
of the dyad. Sociometry (in press),
1959. (a)

Triandis, H. C. Categories of thought of managers,
clerks and workers about jobs and people in an
industry. J. appl. Psychol., 1959, 43, 338-344. (b)


Journal of Applied Psychology 
Vol. 43, No. 5, 1959


A FEMININITY ADJECTIVE CHECK LIST 1


RALPH F. BERDIE 


Student Counseling Bureau, University of Minnesota 


In spite of valiant attempts to construct
theories to explain the development of vocational
interests (Bordin, 1943; Darley &
Hagenah, 1955; Strong, 1943; Super, 1953),
no available theoretical basis allows us to
predict the scores a child will obtain as an
adult on the Strong Vocational Interest Blank.
The purpose of this paper is to suggest briefly
an approach to vocational interest theory construction,
and then in more detail to describe
the development of an instrument devised to
aid in such theoretical exploration.

Scores on vocational interest blanks are
closely and intimately related to other measurable
aspects of personality, and a theory
of vocational interest development must be
regarded as a part of the broader theory of
personality development. The aspects of behavior
which are elicited through the use of
vocational interest blanks are called vocational
interests because the items in the blank
tend to refer to vocations, and the development
of the blank makes use of groups of
persons segregated upon the basis of occupation.
Furthermore, these blanks for the most
part are used to help persons make decisions
concerning occupational choices. It is doubtful,
however, if a structured organization of
personality dynamisms or behaviors that can
be called vocational interests exists within the
individual apart and discrete from other patterns
of personality organization. One might
say that as we view personality through the
lens of a vocational interest blank, and as we
relate what we see to the present and future
occupational behavior of the individual, we
are observing vocational interests, but it is the
lens and our purpose of observation rather
than the personality structure that makes them
vocational interests.

If this definition of vocational interests is
correct, then any theory which aids in the
explanation of interests must make use of
concepts that also have power to explain the
development of other aspects of personality.
In other words, the concepts that best will
explain the development of vocational interests
must also explain how this development
is a part of the total development of personality.
They must allow predictions to be
made concerning vocational interests on the
basis of various types of personal behaviors,
and in turn must maximize the consistency
between vocational interests and these other
behaviors.

1 This study was supported with funds from the
Graduate School and the Office of the Dean of Students,
University of Minnesota.

A basis for constructing a theory of voca- 
tional interests might be found if two types 
of concepts were employed experimentally, a 
concept of dimension and a concept of proc- 
ess. We are accustomed to working with both 
types of concepts in psychology. Dimensions 
include such things as ability, sociability, 
rigidity, and masculinity-femininity. Proc- 
esses include such things as identification, re- 
pression, self-acceptance, and perception. We 
propose to test whether an analysis of voca- 
tional interests using a few carefully selected 
dimensions and processes will help explain 
some of the underlying sources of variance of
interests and personality. Dimensions we 
hope to eventually study include ability, 
masculinity-femininity, sociability, rigidity, 
and socioeconomic status. The processes to 
be considered include identification with other 
persons, self-acceptance, and perceptual dis- 
crepancies, the latter including discrepancies 
between self, others, and occupational stereo- 
types. 

The dimension selected first for study was 
that of masculinity-femininity. Much is 
known about this dimension and, more im- 
portantly, it has been used frequently in theo- 
retical discussions of both vocational interests 
and personality. An individual's behavior
and his occupation are influenced by a va- 
riety of roles, roles defined by his race, his 
religion, his family, and his peers, but per- 
haps no role plays so important a part as the 
role defined by the person's sex. Much of
what a person does, even those things not
directly related to sexual behavior, is influ- 
enced by his perception of how other persons 
of his sex behave and by what he considers 
appropriate behavior for persons of his sex. 

Counselors and clinical psychologists work
with many persons who have failed to achieve
a satisfactory definition of their sexual roles. 
Many men are reluctant to show affection, 
to accept their own emotional experiences, 
and to modify their own dominant and sub- 
missive behavior patterns appropriately for 
given situations. Many men and women face 
problems of vocational choice related to con- 
fusion of sexual role. 

Several methods are available for measur- 
ing psychological masculinity-femininity. Per- 
haps the most complete and comprehensive 
report of a measuring instrument is contained 
in Terman and Miles’ book (1936). The 
Strong Vocational Interest Blank, the Minne- 
sota Multiphasic Personality Inventory, and 
the Rorschach all provide scores related to 
masculinity-femininity. Other methods, both 
projective and inventory, have been developed 
to assess masculinity. All methods, however, 
which have been demonstrated to have both 
reliability and validity are rather cumbersome 
for use in large scale research designs, and the 
first problem faced was to develop a means
for assessing masculinity-femininity which
was fast, easy to administer, efficient, and
possessed of the necessary reliability and validity
for research purposes.


The Development of an Adjective Check List 


The adjective check list has some advantages over
certain other methods. It is easily administered and
scored, it has been demonstrated to have some
validity, and perhaps most importantly, Ss are not
reluctant to perform this kind of task. Therefore
an adjective check list was prepared consisting, for
the most part, of adjectives found in previous
studies related to masculinity-femininity. The Gough
Adjective Checklist was used as a starting point
(Gough, 1955). This list contains 300 adjectives,
many of which have been found at the University of
California to be related to masculinity-femininity
scores on the Strong Vocational Interest Blank, the
Minnesota Multiphasic Personality Inventory, and
the California Psychological Inventory. Review of
other studies of masculinity-femininity resulted in
the addition of a few items to the list. A check list,
as finally used, consisted of 148 adjectives arranged
in four columns. It is shown in Fig. 1.


The standardization sample consisted of a group
of 600 students asked to complete the check list in
the summer of 1955, prior to their matriculation as
freshmen at the University of Minnesota. This group
of students was given the check list five times with
five different instructions. First, they were asked
to check each adjective thought to apply to themselves.
Next, using another copy of the same check
list, they were instructed to check those adjectives
which described the kind of persons they would like
to be. Then they checked adjectives thought most
descriptive of the average person of their age and
sex. Next were checked adjectives considered most
descriptive of father and finally adjectives considered
most descriptive of mother. Thus, five adjective
check lists were available for each S, a list for
self, for ideal, for average, for father, and for mother.

The standardization sample included 200 female
freshmen in the College of Science, Literature, and
the Arts, 200 male freshmen from that college, and
200 male freshmen from the Institute of Technology.
These Ss averaged 18 years of age, almost
all of them were graduates of Minnesota high
schools, and almost all of them were selected
students coming from the upper part of
their high school classes. About one-half came from
metropolitan areas, the remainder from small cities,
towns, and farms.

The groups of 200 SLA men and of 200 IT men
were each divided at random into two groups with
100 in each group. The first 100 SLA men were
known as the standardization SLA group, the
first 100 IT men were known as the standardization
IT men, and the two remaining male groups were
known as the nonstandardization groups.

In order to develop the scale, the 100 SLA and
100 IT men in the standardization groups were
combined and compared to the 200 SLA women.
Item response frequencies on the self-description
were determined for the two groups on the 148
items. When the significance of the differences was
analyzed, 15 items were found to be checked more
frequently by the men than by the women with the
level of significance at .05, and 46 items were checked
significantly more often by the women. Thus, 61 of
the 148 items significantly differentiated between the
two criterion groups. The significance of differences
was checked using critical ratios, the Lawshe-Baker
Nomograph, and finally chi-square tests.
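The item-selection step just described can be sketched in code. The following is a minimal illustration, not the author's actual computation: it assumes a pooled-proportion critical ratio and an uncorrected 2 x 2 chi-square, and the endorsement counts (70 of 200 men, 110 of 200 women) are hypothetical numbers chosen only to show the arithmetic.

```python
import math

def proportion_critical_ratio(k1, n1, k2, n2):
    """Critical ratio for the difference between two endorsement proportions.

    Assumption: the standard error uses the pooled proportion.
    """
    p1, p2 = k1 / n1, k2 / n2
    pooled = (k1 + k2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    return (p1 - p2) / se

def chi_square_2x2(k1, n1, k2, n2):
    """Uncorrected chi-square for a 2 x 2 table of checked vs. not checked."""
    a, b = k1, n1 - k1          # group 1: checked, not checked
    c, d = k2, n2 - k2          # group 2: checked, not checked
    n = n1 + n2
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))

# Hypothetical counts for one adjective (not taken from the study).
cr = proportion_critical_ratio(70, 200, 110, 200)
chi2 = chi_square_2x2(70, 200, 110, 200)
print(f"critical ratio = {cr:.2f}, chi-square = {chi2:.2f}")
```

For a 2 x 2 table the uncorrected chi-square equals the square of the pooled critical ratio, so the two checks named in the text agree on which items pass the .05 level (chi-square above 3.84 with one degree of freedom).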

The scoring scale finally adopted included 46 items
which were given positive weights of one and 15
items given negative weights of one. Given positive
weights were Items 2, 8, 9, 18, 19, 26, 36, [illegible],
43, 44, 47, 48, 53, 54, 59, 62, 72, 76, 82, [illegible],
87, 89, 99, 102, 108, 109, 110, 111, 112, 115, [illegible],
118, 119, 120, 125, 127, 129, 133, 135, 137, [illegible],
146, and 148. Given negative weights were Items 3,
[illegible], 80, 88, 104, 114, 126, 136, [illegible], and
145. The scale thus was really a femininity scale
rather than a masculinity scale insofar as most of
the items which determined the score were ones
which were marked more characteristically by women
than by men. The possible range of scores was from
46, the most feminine score, to minus 15, the most
masculine score, or a total range of 61.

After the development of the scale, all the adjective
checklists for all of the groups were scored.

  1 active           38 dependent          75 luxury-loving     112 shallow
  2 affectionate     39 determined         76 mannerly          113 sharp-witted
  3 aggressive       40 distrustful        77 masculine         114 shrewd
  4 alert            41 dominant           78 mature            115 shy
  5 ambitious        42 dreamy             79 methodical        116 simple
  6 anxious          43 effeminate         80 mild              117 sincere
  7 argumentative    44 emotional          81 moderate          118 slow
  8 appreciative     45 enterprising       82 modest            119 soft-hearted
  9 artistic         46 fair-minded        83 nervous           120 spontaneous
 10 assertive        47 feminine           84 noisy             121 steady
 11 athletic         48 flirtatious        85 obnoxious         122 stolid
 12 autocratic       49 forceful           86 organized         123 straightforward
 13 boisterous       50 foresighted        87 outgoing          124 strong
 14 bold             51 fussy              88 out-of-doors      125 submissive
 15 calm             52 gentle             89 outspoken         126 suspicious
 16 capable          53 graceful           90 painstaking       127 sympathetic
 17 cautious         54 gracious           91 patient           128 tactful
 18 charming         55 greedy             92 peaceable         129 temperamental
 19 cheerful         56 hasty              93 persevering       130 tense
 20 civilized        57 helpful            94 planful           131 thorough
 21 clearthinking    58 hostile            95 precise           132 thoughtful
 22 clever           59 humorous           96 progressive       133 thoughtless
 23 coarse           60 imaginative        97 rational          134 timid
 24 cold             61 impatient          98 reckless          135 tolerant
 25 commonplace      62 impulsive          99 refined           136 tough
 26 complicated      63 independent       100 resentful         137 trusting
 27 confident        64 industrious       101 reserved          138 unaffected
 28 conscientious    65 initiative        102 restless          139 unambitious
 29 conservative     66 insightful        103 robust            140 understanding
 30 considerate      67 intelligent       104 rough             141 unemotional
 31 contented        68 interests narrow  105 rude              142 unkind
 32 conventional     69 interests wide    106 self-centered     143 versatile
 33 cool             70 intolerant        107 self-controlled   144 vigorous
 34 courageous       71 jolly             108 selfish           145 virile
 35 cruel            72 kind              109 sensitive         146 warm
 36 curious          73 leisurely         110 sentimental       147 weak
 37 demanding        74 logical           111 serious           148 worried

Fig. 1. Adjective check list from which masculinity-femininity score is derived.
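The unit-weight scoring rule described above can be sketched as follows. This is a minimal sketch under stated assumptions: the two key sets contain only a handful of the item numbers named in the text, so they are a partial, illustrative key rather than the full 46 and 15 items, and the function names are hypothetical.

```python
# Partial, illustrative key: a few of the item numbers named in the text.
# The full scale uses 46 positive and 15 negative items.
POSITIVE_ITEMS = {2, 8, 9, 18, 19}   # weight +1 (checked more often by women)
NEGATIVE_ITEMS = {3, 88, 104}        # weight -1 (checked more often by men)

def femininity_score(checked, positive=POSITIVE_ITEMS, negative=NEGATIVE_ITEMS):
    """Number of positive-keyed items checked minus negative-keyed items checked."""
    checked = set(checked)
    return len(checked & positive) - len(checked & negative)

def possible_range(positive, negative):
    """Lowest and highest attainable scores for a given key."""
    return -len(negative), len(positive)

# A hypothetical S who checked items 2, 8, 9, 3, and 50 (50 is unkeyed).
example = femininity_score([2, 8, 9, 3, 50])
print(example)
```

With a key of 46 positive and 15 negative items, `possible_range` returns the limits of minus 15 and 46 stated in the text; unkeyed items such as 50 contribute nothing to the score.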


Scores on the Scale 


Mean scores and standard deviations for 
the various groups on the self-descriptions 
are shown in Table 1. 

For the groups upon which the scale was 
developed, the mean score for the 200 SLA 
women was 16.7 as compared to the means 
for the male groups of 7.9 for SLA men and 
7.7 for IT men. Thus, the means for the 
men and women are more than a standard 
deviation apart. The mean of the 100 non- 
standardization SLA men was 10.7 and for the 
100 nonstandardization IT men, 8.3, thus 
showing some shrinkage when the scale was 
applied to groups upon which it had not been 
developed. A group tested in 1957, to be 
described later, provides further information 
about such shrinkage for both sexes.
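The statement that the men's and women's means lie more than a standard deviation apart can be checked with a little arithmetic. The sketch below uses the means quoted in this paragraph and the standard deviations as read from Table 1; dividing by the larger of the two standard deviations is an assumption made here to keep the check conservative, not a procedure stated in the paper.

```python
def separation_in_sd_units(mean_a, mean_b, sd):
    """Difference between two means expressed in standard deviation units."""
    return (mean_a - mean_b) / sd

women_mean, women_sd = 16.7, 6.5          # 1955 SLA women
men_groups = {
    "SLA standardization men": (7.9, 5.5),
    "IT standardization men": (7.7, 5.3),
}

separations = {}
for label, (mean, sd) in men_groups.items():
    # Assumption: use the larger SD of the pair as the yardstick.
    separations[label] = separation_in_sd_units(women_mean, mean, max(women_sd, sd))
    print(f"{label}: {separations[label]:.2f} SD below the women's mean")
```

Both separations exceed one standard deviation even with the larger yardstick, which agrees with the claim in the text.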

The scores of the women ranged from 0
to 31, the scores of the men from minus 6 to
28. The median women's score was 17 and
of the 400 men, only 22, or 5%, obtained
scores above 17. The median score for the
men was 8, and only 7% of the women received
scores below 8.

Analysis of variance was used to determine
the significance of differences between various
groups. Among the groups of men, three differences
were found to be statistically significant
at the .001 level. The nonstandardization
SLA group received more feminine scores
than did the SLA standardization group, and
the nonstandardization SLA group obtained
more feminine scores than both the standardization
and nonstandardization IT groups.
Variances were homogeneous.

Table 1
Means and Standard Deviations of Self-Description
Femininity Score for Groups Tested

Group                               N    Mean           SD
1955 SLA standardization men       100    7.9           5.5
1955 IT standardization men        100    7.7           5.3
1955 SLA nonstandardization men    100   10.7           5.2
1955 IT nonstandardization men     100    8.3           5.2
1955 SLA women                     200   16.7           6.5
1957 SLA men                       205    9.1           5.4
1957 IT men                        246   [illegible]    5.2
1957 SLA women                     193   15.6           6.1
1957 Homosexual men                 43   18.9           5.8


Reliability of the Femininity Scale 


The self-description adjective check lists for
the 200 nonstandardization IT and SLA men
were scored to provide split-half scores. One
half consisted of the weighted items checked
in Columns 1 and 3 and the other was based
on items checked in Columns 2 and 4. The
correlation between scores for these two halves
for the 200 men was .45. The same 200
adjective check lists were scored for odd and
even items, and this provided a correlation of
.49. In December 1955, 95 men entering the
College of Science, Literature, and the Arts
completed the adjective check list and one
day later completed the blank again. The
test-retest correlation was .81.

The two low correlations suggest more
about the nature of the adjective check list
than a lack of reliability. The items that differentiate
men and women may be unreliable
or inconsistent, but low internal reliability, as
compared to the higher test-retest reliability,
suggests that many of these items are not related
one to another. A low odd-even reliability
well might be obtained if one used
instead of adjectives such things as length of
hair on the head, tendency toward baldness,
height, and ratio of weight and chest measurements.
These are all indices that discriminate
reliably between the sexes but which tend to
have very little correlation among themselves
within a given sex. In the adjective check list,
30% of men and 60% of women checked the
adjective "cool," and 44% of men and 30%
of women checked the adjective "aggressive."
Both of these adjectives discriminate between
sexes, but either among men or among women
the correlation between these two items might
well be zero. A relatively low odd-even correlation
suggests that many kinds of behaviors
discriminate between men and women,
and that these behaviors are not all related.
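The argument in this paragraph can be illustrated numerically. The sketch below assumes that the two adjectives are exactly independent within each sex (the "might well be zero" case) and that men and women are equally represented; both are assumptions made for illustration. It then computes the phi coefficient that would appear if the sexes were pooled, using the endorsement percentages quoted in the text.

```python
import math

def pooled_phi(p_item1, p_item2, weights):
    """Phi coefficient between two yes/no items in a pooled sample.

    p_item1, p_item2: endorsement proportions per group.
    weights: share of each group in the pooled sample (sums to 1).
    Assumption: the items are independent within every group, so any
    pooled correlation comes only from the group differences.
    """
    p1 = sum(w * a for w, a in zip(weights, p_item1))
    p2 = sum(w * b for w, b in zip(weights, p_item2))
    p_both = sum(w * a * b for w, a, b in zip(weights, p_item1, p_item2))
    covariance = p_both - p1 * p2
    return covariance / math.sqrt(p1 * (1 - p1) * p2 * (1 - p2))

# Proportions as quoted in the text, ordered (men, women).
cool = [0.30, 0.60]
aggressive = [0.44, 0.30]
phi = pooled_phi(cool, aggressive, weights=[0.5, 0.5])
print(f"pooled phi = {phi:.3f}")
```

Under these assumptions the pooled correlation is only about -.04, even though each adjective separates the sexes. Items of this kind can therefore add up to a discriminating total score while contributing almost nothing to split-half or odd-even correlations.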


Relationships between Femininity Indices

For most Ss, additional data were available
and the relationships between adjective check
list scores and other data were observed.

Table 2
Correlations between Femininity Indices for Groups of SLA and IT Nonstandardization Men

                 1       2       3       4       5       6       7
1. MMPI                -.42    [?]     .20    [?]     .02     .13
2. SVIB        -.31            -.40    -.43    -.05    -.13    -.01
3. Self         .20    [?]              .53     .39     .21     .33
4. Ideal        .30    [?]      .63             .47     .50     .47
5. Average      .18    [?]      .51     .37             .37     .39
6. Father       .19    [?]      .71     .59     .69             .43
7. Mother       .14    [?]      .49     .49     .48     .57

Note. Correlations in upper right segment of table for 81-100 SLA men, in lower left segment for 76-100 IT men. Entries marked [?] are illegible.


Correlations are presented in Table 2, the
upper right segment of figures pertaining to
the nonstandardization SLA men, the lower
left group of figures to the nonstandardization
IT men. In this table, a correlation of .28 is
statistically significant at the .01 level, .22
at the .05 level.

For the SLA men the self-description correlates
significantly with Mf score on the
Minnesota Multiphasic Personality Inventory
and with the masculinity score on the Strong
Vocational Interest Blank. Only the latter
correlation is significant for the IT men. The
one correlation that shows a large difference
between the two groups is that between the
score based on self-description and the description
of father, where the correlation for
the IT men is .71, for the SLA men, .21. In
spite of the great statistical significance of
this difference, a repeat of this analysis two
years later provided correlations with no significant
difference. These correlations did
suggest, however, that the perception of father
in terms of masculinity-femininity is somewhat
different for engineering freshmen than
for Arts College freshmen.

The scores for self and for ideal are rather
highly correlated for both groups, with the
self-average correlation being somewhat lower.
These people describe themselves as being
more similar to their ideal than they do to
their perception of the average person of their
age and sex.

The two correlational matrices for the seven
variables were compared by transforming
the r's to z's, $z = \tfrac{1}{2}\log_e \frac{1+r}{1-r}$, and comparing
the differences between z's. The observed t
was not significantly different from zero,
p = .56, thus supporting the assumption that
the two sets of correlations were similar. The
matrix for IT men then was factor analyzed,
using Thurstone's (1947) centroid method.
The factor analysis was iterated once in order
to stabilize the communality estimate. Three
factors were extracted. These three factors
were rotated obliquely and the loadings on
the rotated reference vectors were:


              Reference vector
Variable      A       B       C
   1         .02     .55     .21
   2         .04    -.53    -.01
   3         .69     .18    -.31
   4         .60     .31    -.04
   5         .68     .09     .13
   6         .89    -.05    -.02
   7         .62     .01     .01

Correlations between the primary vectors
were:

A & B .28     A & C .07     B & C -.15


The test indices thus form one cluster, the 
adjective indices another. A factor analysis 
of the five adjective indices alone provided 
two factors which correlated .60. Self and 
ideal tended to cluster, the others did not. 


Application of the Scale to Another Group 


A group of entering freshmen similar to 
those studied in 1955 was given the check 
list in 1957. This was done in order to repli- 
cate certain portions of the study, to obtain 
information concerning additional variables, 
and to obtain a further idea of the shrinkage 
using groups other than the ones upon which 


332 


the scale was standardized. The means and 
standard deviations for the three 1957 groups 
are also presented in Table 1. There it will 
be seen that the mean score for 1957 SLA 
women was about one point lower than the 
mean for the group upon which the scale had 
been standardized. Thus, relatively little 
shrinkage was found. The mean of the 1957 
SLA men was between the means of the two 
SLA groups in the earlier year, and the same 
was true for the 1957 IT group. These 
means, obtained for the various groups dur- 
ing the two years, and the figures pertaining 
to overlap indicate that this is a reasonably 
valid means for assessing the psychological 
differences between men and women of this 
age. The difference between 1957 SLA men 
and 1957 SLA women provided a critical 
ratio of 11.22, significant at the level of .001. 
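The reported critical ratio can be reproduced approximately from the Table 1 entries for the 1957 groups. The sketch below assumes the critical ratio is the difference between means divided by the unpooled standard error of that difference; the paper does not state which formula was used, so this is an assumption.

```python
import math

def critical_ratio(mean1, sd1, n1, mean2, sd2, n2):
    """Difference between two means divided by its standard error.

    Assumption: unpooled standard error, sqrt(sd1^2/n1 + sd2^2/n2).
    """
    standard_error = math.sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)
    return (mean1 - mean2) / standard_error

# Table 1 values: 1957 SLA women versus 1957 SLA men.
cr = critical_ratio(15.6, 6.1, 193, 9.1, 5.4, 205)
print(f"critical ratio = {cr:.2f}")
```

With the rounded means and standard deviations from Table 1 the result is about 11.23, within rounding of the 11.22 reported in the text.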

This sample provided an opportunity for
us to study further the relationships between
"self" scores and other scores on the adjective
check list and to give particular attention to
the correlations which in the 1955 sample
distinguished between the SLA and IT groups.
Table 3 summarizes these correlations.

The "self" and "ideal" correlations were
relatively constant. However, the large difference
in correlations between self-description
and description of father for the two
college groups disappeared. In the 1955
samples, these two correlations were .21 and
.71; in the 1957 samples, .45 and .42. Thus,
what was a statistically significant difference
upon replication was reversed in direction and
no longer significant.

Table 3
Correlations between Selected Femininity Indices for
Groups of SLA and IT Nonstandardization Men
in 1955 and SLA and IT Men in 1957

                          SLA                 IT
Variables
Correlated           1955     1957       1955     1957
N                     100       85        100      157
Self and ideal       .53**    [?]        .63**    [?]
Self and father      .21*     .45**      .71**    .42**
Self and mother      .33**    [?]        .49**    .30**

* Significant between .05 and .01.
** Significant beyond .01.
Entries marked [?] are illegible.


The two correlations for the SLA groups
were different at a level significant at .07.
The difference between the two IT correlations
was significant beyond .01. Thus, the
difference between the two original correlations,
.21 and .71, changed because of what
appear to be significant changes in both
groups with the greatest change in the IT
group.
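These two significance levels can be checked with the r-to-z transformation. The sketch below is a standard large-sample test for the difference between two independent correlations, with standard error sqrt(1/(n1 - 3) + 1/(n2 - 3)) and a two-tailed normal probability. Whether the author used exactly this procedure is an assumption; the sample sizes are those listed in Table 3.

```python
import math

def fisher_z(r):
    """Fisher's transformation, z = (1/2) * ln((1 + r) / (1 - r))."""
    return 0.5 * math.log((1 + r) / (1 - r))

def compare_correlations(r1, n1, r2, n2):
    """Normal deviate and two-tailed p for two independent correlations."""
    standard_error = math.sqrt(1 / (n1 - 3) + 1 / (n2 - 3))
    deviate = (fisher_z(r1) - fisher_z(r2)) / standard_error
    p_two_tailed = math.erfc(abs(deviate) / math.sqrt(2))
    return deviate, p_two_tailed

# Self and father correlations, 1957 versus 1955.
sla_deviate, sla_p = compare_correlations(0.45, 85, 0.21, 100)
it_deviate, it_p = compare_correlations(0.71, 100, 0.42, 157)
print(f"SLA: z = {sla_deviate:.2f}, p = {sla_p:.3f}")
print(f"IT:  z = {it_deviate:.2f}, p = {it_p:.4f}")
```

Under these assumptions the SLA comparison gives p of about .07 and the IT comparison gives p well below .01, in line with the levels stated in the text.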


Scores of Homosexual Men 


Masculinity-femininity indices have been
validated not only on the basis of differences
between men and women but also by contrasting
homosexual men with others. The
cooperation of Evelyn Hooker (1957) made
possible the collection of data from a group
of 43 homosexual males in California. These
were men who were not institutionalized, who
were tending to make adequate occupational
adjustments, and who had participated as Ss
in other experiments.

The median age of this group was 33 years
with a Q1 of 28 and a Q3 of 45 years. Of the
group, 88% were high school graduates, [illegible]
had gone beyond high school, 33% were college
graduates, and 5% had gone beyond college.
Thus, the homosexual group differed
from the other male groups for whom data
were available on the basis of age, education,
and geography, as well as on the basis of
sexual behavior and preferences. The difference
in mean scores might result from any
single difference or combination of these differences.

The mean femininity score for the homosexual
group, as shown in Table 1, was 18.9,
higher than the means for the two groups
of college women and approximately two
standard deviations above the means for the
college men. Of the homosexual group, 14,
or 33%, obtained scores below 17, the median
of the college women, and only 1 or 2%
obtained scores below eight, the median for
college men.

Others have suggested a relationship between
age and psychological femininity (Strong,
1943), but the available data suggest that the
relationship is not sufficiently high so that
the difference obtained here should be attributed
to the greater age of the homosexual
group.

The correlations between the self-descriptions
and indices based on the other adjective
checklists were compared for the homosexual
group and the IT and SLA men. These correlations
were:

              Self and   Self and   Self and   Self and
              Ideal      Average    Father     Mother
Homosexual     .47        .23        .31        .26
SLA            .53        .39        .21        .33
IT             .63        .51        .71        .49

Both college groups described themselves as
being more similar to their ideal and to their
perception of the average person of their age
and sex than did the homosexual group. No
such consistent trends were found for parental
descriptions.


Summary and Conclusions 


An adjective check list scale was developed
to provide an easily obtainable index of psychological
masculinity-femininity. The derived
scale was based on 61 items included in
a list of 148 adjectives. Only a minute or
two is required to check the list by most
Ss. The index substantially distinguishes between
groups of male and female college
freshmen, and between a group of homosexual
men and male college freshmen. The
nonunitary character of the scale is revealed
by low intrascale correlations. The higher
test-retest reliability and the higher interscale
correlations suggest that the index is
reliable enough for the kinds of group research
for which it was developed. The scale
is not presented as an instrument to be used
for purposes of individual diagnosis.

Correlations between this index and masculinity
scores on the Strong Blank and the
MMPI are in the expected direction and are
statistically significant, but the order of these
correlations suggests the variables measured
by the three instruments are not the same. It
is hoped that the adjective check list will increase
our understanding of the other two
well established instruments and further our
knowledge of the psychological meaning of
the masculinity-femininity variable.


Received December 17, 1958. 


References 


Bordin, E. S. A theory of vocational interests as
dynamic phenomena. Educ. psychol. Measmt.,
1943, 3, 49-65.

Darley, J. G., & Hagenah, T. Vocational interest
measurement: Theory and practice. Minneapolis:
Univer. Minnesota Press, 1955.

Gough, H. G. Reference handbook for the Gough
Adjective Checklist. Berkeley: Univer. California
Inst. of Pers. Assessment Res., 1955.

Hooker, E. The adjustment of the male overt
homosexual. J. proj. Tech., 1957, 21, 18-31.

Strong, E. K., Jr. Vocational interests of men and
women. Stanford Univer. Press, 1943.

Super, D. E. A theory of vocational development.
Amer. Psychologist, 1953, 8, 185-190.

Terman, L. M., & Miles, C. C. Sex and temperament.
New York: McGraw-Hill, 1936.

Thurstone, L. L. Multiple factor analysis. Chicago:
Univer. Chicago Press, 1947.


Journal of Applied Psychology
Vol. 43, No. 5, 1959


PREDICTION OF CONSUMER PURCHASE AND THE 
UTILITY OF MONEY 1


LYLE V. JONES 


University of North Carolina 


The model for prediction of consumer choice 
presented by Thurstone (1945; 1951) repre-
sents a significant advance toward the applica-
tion of subjective measurement theory. Using 
that model it becomes possible to predict, from 
results of psychological scaling analysis, aspects 
of actual behavior of a group of consumers. 
The model has obvious relevance to problems 
of marketing and prediction of voting, to name 
but two problem areas. 

While Thurstone presented prediction pro- 
cedures in a nonparametric form, they may be 
easily extended in terms of the parametric 
preference model provided by the method of 
successive intervals (Adams & Messick, 1958). 
Such an extension has been derived by Bock
(1956). In this extended form, the model
allows consideration not only of differential 
preference for a group of competing consumer 
goods, but also inclusion of indices of the popu- 
larity of the prices at which each of the con- 
sumer items is offered (Jones, 1956). 


The Model 


Prediction of Choice 


In accordance with the scaling model, assume that the hypothetical
preference score of the ath respondent for the ith item may be written

\[ X_{ia} = V_i + e_{ia} \qquad [1] \]

where \(V_i\) is the population mean preference score for Item \(i\)
and \(e_{ia}\) is a random component attributable to individual
differences in preference, distributed normally with zero mean and
variance of \(\sigma_i^2\) in the population of Ss. Assuming preference
scores for any pair of consumer items to be uncorrelated, the
normalized difference between an individual's preference for a pair of
items, \(i\) and \(j\), is

\[ z_{ija} = \frac{(X_{ia} - V_i) - (X_{ja} - V_j)}{\sqrt{\sigma_i^2 + \sigma_j^2}} \qquad [2] \]

\(z_{ija}\) has the unit normal distribution. Further, the joint
distribution of \(z_{ija}\), \(z_{ika}\) is bivariate normal with mean
of zero, variance of one, and correlation coefficient of \(\rho\)
(Bock, 1956). That is,

\[ f(z_{ija}, z_{ika}) = N(0, 0, 1, 1, \rho) \qquad [3] \]

From the method of successive intervals may be obtained estimates of
\(V_i\), \(V_j\), \(V_k\), \(\sigma_i^2\), \(\sigma_j^2\), and
\(\sigma_k^2\) for any three consumer items. Using those estimates
(distinguished by a caret, ^) define

\[ c_{ji} = \frac{\hat{V}_j - \hat{V}_i}{\sqrt{\hat{\sigma}_i^2 + \hat{\sigma}_j^2}} \qquad [4] \]

and

\[ c_{ki} = \frac{\hat{V}_k - \hat{V}_i}{\sqrt{\hat{\sigma}_i^2 + \hat{\sigma}_k^2}} \qquad [5] \]

Then the probability that an individual will choose Object \(i\) over
both competing Objects \(j\) and \(k\) is given by

\[ P(i > j, k) = \int_{c_{ji}}^{\infty} \int_{c_{ki}}^{\infty} f(z_{ija}, z_{ika}) \, dz_{ika} \, dz_{ija} \qquad [6] \]

This integral may be evaluated with tables for determining the volume
of quadrants of the bivariate normal distribution function (Pearson,
1931). In the general case of predicting choice of one from \(n\)
competing objects, evaluation of the multiple integral is possible by
reduction methods (e.g., Plackett, 1954).


1 This paper reports research sponsored, in part, by the Quartermaster
Food and Container Institute for the Armed Forces. The views or
conclusions contained in this report are those of the author. They are
not to be construed as necessarily reflecting the views or
indorsements of the Department of Defense.
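The choice probability of Equation [6] can also be sketched numerically. The following is a minimal illustration, not the table-based procedure of the paper: it assumes independent normal preference scores, as the model does, and uses the equivalent one-dimensional form in which the density of item i's score is multiplied by the probabilities that each competitor scores lower. The scale values `V` and dispersions `sigma` are hypothetical; no such values are reported in the text.

```python
import math

def normal_pdf(x, mean, sd):
    """Density of a normal distribution with the given mean and standard deviation."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))

def normal_cdf(x, mean, sd):
    """Cumulative distribution function of a normal distribution."""
    return 0.5 * (1.0 + math.erf((x - mean) / (sd * math.sqrt(2.0))))

def choice_probability(i, means, sds, steps=2000):
    """P(item i has the highest score), assuming independent normal scores.

    Integrates pdf_i(x) * prod_{j != i} cdf_j(x) over x with the trapezoid rule.
    """
    lo = means[i] - 8.0 * sds[i]
    hi = means[i] + 8.0 * sds[i]
    h = (hi - lo) / steps
    total = 0.0
    for s in range(steps + 1):
        x = lo + s * h
        value = normal_pdf(x, means[i], sds[i])
        for j in range(len(means)):
            if j != i:
                value *= normal_cdf(x, means[j], sds[j])
        weight = 0.5 if s in (0, steps) else 1.0
        total += weight * value
    return total * h

# Hypothetical scale values and discriminal dispersions for three items.
V = [0.8, 0.3, 0.0]
sigma = [1.0, 1.2, 0.9]
probs = [choice_probability(i, V, sigma) for i in range(3)]
print([round(p, 3) for p in probs])
```

With only two items the function reduces to the unit normal integral above the limit defined in Equation [4], which gives a simple check on the sketch.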


Prediction of Purchase 


Assume that the competing consumer objects are differentially priced.
Then the preference score of the ath individual for the ith object
offered at Price \(p\) may be written

\[ X_{ipa} = V_i + U_p + e_{ia} \qquad [7] \]

where \(V_i\) is the population mean preference score for Item \(i\),
\(U_p\) is the population mean utility of Price \(p\), and \(e_{ia}\),
as before, is a random component distributed as \(N(0, \sigma_i)\).
Analogously to Equation [2], define

\[ z_{ip,jq,a} = \frac{[X_{ipa} - (V_i + U_p)] - [X_{jqa} - (V_j + U_q)]}{\sqrt{\sigma_i^2 + \sigma_j^2}} \qquad [8] \]

Under the previous assumptions, and analogous to [3],

\[ f(z_{ip,jq,a}, z_{ip,kr,a}) = N(0, 0, 1, 1, \rho) \qquad [9] \]

To express the probability that an individual would purchase Object
\(i\) at Price \(p\) rather than Object \(j\) at Price \(q\) or Object
\(k\) at Price \(r\), we again evaluate an integral of the form of [6],

\[ P(ip > jq, kr) = \int_{c_{jq,ip}}^{\infty} \int_{c_{kr,ip}}^{\infty} f(z_{ip,jq,a}, z_{ip,kr,a}) \, dz_{ip,kr,a} \, dz_{ip,jq,a} \qquad [10] \]

where the lower limits of integration are

\[ c_{jq,ip} = \frac{(\hat{V}_j + \hat{U}_q) - (\hat{V}_i + \hat{U}_p)}{\sqrt{\hat{\sigma}_i^2 + \hat{\sigma}_j^2}} \]

and

\[ c_{kr,ip} = \frac{(\hat{V}_k + \hat{U}_r) - (\hat{V}_i + \hat{U}_p)}{\sqrt{\hat{\sigma}_i^2 + \hat{\sigma}_k^2}} \]

However, in this case we lack empirical estimates of all parameters.
While the method of successive intervals provides the estimates
\(\hat{V}_i\) and \(\hat{\sigma}_i\), the \(\hat{U}\) values remain
unknown.

With three competing consumer objects and empirically known
proportions of purchase of each, there are three equations of the form
of Equation [10], one yielding \(P(ip > jq, kr)\), one yielding
\(P(jq > ip, kr)\), and the third (not independent),
\(P(kr > ip, jq)\). Using these values, tables of the bivariate normal
distribution function allow iterative solutions for the three
estimates of utility, \(\hat{U}_p\), \(\hat{U}_q\), and \(\hat{U}_r\).
Finally, if data are available for several sets of three competing
consumer objects, each set containing one object at Price p, one at
Price q, and one at Price r, then several estimates may be found and
checked for consistency, for each utility of price.
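The iterative solution for the price utilities can be sketched as follows. This is an illustration under stated assumptions rather than the tabular procedure described above: the damped log-ratio update rule, the `step` size, and all numerical values (`V`, `sds`, `true_U`) are hypothetical, and the utility of the first item's price is fixed at zero to set the arbitrary origin.

```python
import math

def _pdf(x, mean, sd):
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))

def _cdf(x, mean, sd):
    return 0.5 * (1.0 + math.erf((x - mean) / (sd * math.sqrt(2.0))))

def purchase_probabilities(V, U, sds, steps=400):
    """P(each priced item is chosen): mean score V[i] + U[i], independent normal errors."""
    means = [v + u for v, u in zip(V, U)]
    probs = []
    for i in range(len(means)):
        lo, hi = means[i] - 8.0 * sds[i], means[i] + 8.0 * sds[i]
        h = (hi - lo) / steps
        total = 0.0
        for s in range(steps + 1):
            x = lo + s * h
            value = _pdf(x, means[i], sds[i])
            for j in range(len(means)):
                if j != i:
                    value *= _cdf(x, means[j], sds[j])
            total += (0.5 if s in (0, steps) else 1.0) * value
        probs.append(total * h)
    return probs

def solve_utilities(V, sds, observed, step=0.5, tol=1e-10, max_iter=2000):
    """Adjust price utilities until predicted proportions match the observed ones.

    The first utility stays at zero (arbitrary origin); the others are moved by a
    damped log-ratio correction, an assumed update rule chosen for simplicity.
    """
    U = [0.0] * len(V)
    for _ in range(max_iter):
        predicted = purchase_probabilities(V, U, sds)
        worst = 0.0
        for k in range(1, len(V)):
            delta = step * (math.log(observed[k]) - math.log(predicted[k]))
            U[k] += delta
            worst = max(worst, abs(delta))
        if worst < tol:
            break
    return U

# Hypothetical example: generate proportions from known utilities, then recover them.
V = [0.0, 0.2, 1.0]
sds = [1.0, 1.0, 1.0]
true_U = [0.0, 0.3, -0.9]
observed = purchase_probabilities(V, true_U, sds)
estimated_U = solve_utilities(V, sds, observed)
print([round(u, 3) for u in estimated_U])
```

Because only two of the three proportions are independent, fixing one utility at zero leaves two unknowns for two equations, which is why each set of three entrees gives one estimate per price level.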

The Application 


For this study competing entrees on a 
luncheon menu serve as stimuli. A seven- 
category successive category rating scale was 
mailed to each of the 430 faculty members who 
were also active members of the faculty club at 
the University of Chicago. The addressee was 
instructed to complete the form by placing a 
check mark to indicate the degree to which he 
liked or disliked each menu item. Included on
the schedule were the names of the 15 entrees 
served at the club during a criterion period. A 
total of 297 completed forms were returned, 
comprising 69 per cent of those mailed. 

Five criterion days were selected. On no 
criterion day was there a shortage of a luncheon 
item at the club, and on each day more than 
100 members patronized the regular dining 
room facilities. The frequencies of purchase 
of the three competing luncheon entrees on 
each of the five days serve as criteria. 

From the preference ratings, approximate 
least squares successive intervals estimates for 
scale values and discriminal dispersions were 
obtained by a graphical method (Jones & 
Thurstone, 1955). Based upon these prefer- 
ence parameters, and upon the assumption of 
the normality of distributions of preference 
along the underlying scale continuum, one may 
utilize Equation [6] to predict the proportion 
of consumers who would select each of three 
competing consumer objects. The resulting 
predicted proportions appear in Table 1, 
Column A. A comparison of these predicted 
proportions with actual observed proportions 
of choice indicates that discrepancies are considerable.
The average error in predicting
proportions is .194.
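The average error quoted above can be checked directly from the observed proportions and the Column A predictions of Table 1. The short sketch below simply recomputes the mean absolute discrepancy over the 15 entrees; the variable names are illustrative.

```python
# Observed proportions of choice and Column A predictions, in Table 1 row order.
observed = [.405, .319, .276,
            .215, .505, .280,
            .268, .342, .390,
            .441, .304, .255,
            .295, .439, .266]
predicted_a = [.707, .120, .173,
               .510, .273, .217,
               .623, .200, .177,
               .651, .152, .197,
               .586, .244, .170]

# Mean absolute difference between predicted and observed proportions.
mean_abs_error = sum(abs(p - o) for p, o in zip(predicted_a, observed)) / len(observed)
print(round(mean_abs_error, 3))  # 0.194
```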

The relatively poor fit of predicted to ob- 
served proportions may be partially attribu- 
table to the differing prices at which entrees 
were sold. On each criterion day, one of the 
three entrees was offered at $1.20, one at be- 
tween $.95 and $1.05, and the third at between 
$.80 and $.90. For convenience each of the 
three price levels is considered homogeneous,
best represented by the prices $1.20, $1.00, 


and $.85. 


Table 1

The Menu Items, Their Prices, and Observed and Predicted Proportions of Purchase for Each
(N is the number of luncheon patrons, and serves as the base for the observed proportion)

                                                        Observed      Predicted Proportions
  N   Entree                                   Price    Proportion    A       B
 116  Roast round of beef                      $1.20    .405          .707    .402
      Smoked tongue                             1.00    .319          .120    [illegible]
      Creamed mushrooms on toast                 .85    .276          .173    [illegible]
 107  Fried chicken leg with country gravy     $1.20    .215          .510    .236
      Meat loaf with brown gravy                1.05    .505          .273    [illegible]
      Welsh rarebit on toast                     .80    .280          .217    [illegible]
 123  Roast leg of lamb                        $1.20    .268          .623    [illegible]
      Smoked Thuringer sausage                   .95    .342          .200    [illegible]
      French fried smelts with tartar sauce      .90    .390          .177    [illegible]
 102  Roast leg of lamb                        $1.20    .441          .651    [illegible]
      Braised ox joints                         1.00    .304          .152    [illegible]
      Baked beans                                .80    .255          .197    [illegible]
 139  Roast round of beef                      $1.20    .295          .586    [illegible]
      Creamed chicken with hot biscuit          1.00    .439          .244    [illegible]
      Apple fritters, bacon, and syrup           .85    .266          .170    [illegible]
      Mean |pred - obs|                                               .194    .031


Using the prediction of purchase model specified by Equation [10]
above, iterative solutions for the utilities of the three price levels
were obtained for each of five sets of three equations. Each set of
three equations provides a unique estimate for \(U_{.85}\),
\(U_{1.00}\), and \(U_{1.20}\) on a scale with an arbitrary zero. The
obtained estimates appear in Table 2. It will be noted that the most
divergent values are those for Criterion Day 3. The lowest cost entree
on that day is french fried smelt. That the day was a Friday appears
to have added a determinant of purchase which is not included in the
model.


Table 2

Estimates of Utility Values
(\(U_{.85}\) is arbitrarily assigned zero utility)

Criterion Day    U_{1.20}    U_{1.00}
1                 -.810        .420
2                 -.967        .265
3                -1.395       -.257
4                 -.54[?]      .353
5                 -.916        .154
Mean              -.927        .187


Fig. 1. Final estimates of utility of the three prices. (Horizontal axis: Price, at $.85, $1.00, and $1.20.)


It is also of interest to examine a plot of the mean utilities from
Table 2 (Fig. 1) to determine relative strength of negative utility
for the three prices. The values of \(U_{1.20}\) are consistently more
negative than the values of \(U_{.85}\) and \(U_{1.00}\). While $1.20
is the least preferred price, $1.00 is a price preferred to $.85. In
other words, in this study, utility of price is not monotonically
related to price.
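The nonmonotonic relation can be made concrete with the Table 2 figures. The sketch below assumes the Day 2 entry for the $1.00 price reads .265 (the value that reproduces the printed mean of .187) and takes the $1.20 mean, -.927, directly from the table's Mean row; the variable names are illustrative.

```python
# Day-by-day utility estimates for the $1.00 price (Table 2); Day 2 is read as .265.
u_100 = [0.420, 0.265, -0.257, 0.353, 0.154]
mean_u_100 = sum(u_100) / len(u_100)

# Mean utilities by price; the $.85 price is the arbitrary zero point.
mean_utility = {0.85: 0.0, 1.00: mean_u_100, 1.20: -0.927}

prices = sorted(mean_utility)
values = [mean_utility[p] for p in prices]
nondecreasing = all(a <= b for a, b in zip(values, values[1:]))
nonincreasing = all(a >= b for a, b in zip(values, values[1:]))
is_monotonic = nondecreasing or nonincreasing

print(round(mean_u_100, 3), is_monotonic)
```

Utility rises from $.85 to $1.00 and then falls sharply at $1.20, so the check reports that the sequence is neither nondecreasing nor nonincreasing.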

Utilizing the mean values for the three utilities, final predictions
are made, the results of which appear in Column B, Table 1. The
improvement of fit is demonstrated by the relatively small average
discrepancy of predicted from observed proportions, .031, and lends
credence to the model.

The finding that faculty members, when lunching at the faculty club,
prefer paying $1.00 to paying $.85 may come as a surprise. However, we
might conjecture that the social psychology of publicly ordering lunch
at a table with colleagues provides a disposition away from the
cheapest meal. The present study, of course, provides no evidence as
to the source of the finding. Nor may we legitimately generalize the
findings to any other situations. Nevertheless, it might not be
surprising to find such nonmonotonic relations between price and
utility of price for numerous consumer commodities: for cosmetics,
articles of clothing, household drug supplies; indeed, for any items
where the consumer evaluation of quality is difficult or impossible to
make independent of price.

The method of measuring utility of money illustrated in this study can
be characterized as involving two important restrictions: (a) the
concept measured is one of group utility rather than individual
utility; (b) the method involves inferred rather than direct
measurement, in this case measurement inferred from a preference model
and from proportions of choice of alternative items. With respect to
the first restriction, it should be recognized that similar methods
might be adapted to the study of individual utility. A population
could be defined by the response repertory of a single individual. A
study of economic indifference functions by Thurstone (1931)
illustrates this point. As for the second restriction, it emphasizes
the need for cross-validation, i.e., for evaluation of obtained
estimates via prediction of results for experimentally independent
data.


REFERENCES 


Adams, E., & Messick, S. An axiomatic formulation and generalization
of successive intervals scaling. Psychometrika, 1958, 23, 355-368.

Bock, R. D. A generalization of the law of comparative judgment
applied to a problem in prediction of choice. Amer. Psychologist,
1956, 11, 443. (Abstract)

Jones, L. V. Prediction of consumer purchase. Amer. Psychologist,
1956, 11, 443. (Abstract)

Jones, L. V., & Thurstone, L. L. The psychophysics of semantics: An
experimental investigation. J. appl. Psychol., 1955, 39, 31-36.

Pearson, K. Tables for statisticians and biometricians. Part II.
London: Biometric Lab., University College, 1931.

Plackett, R. L. A reduction formula for normal multivariate integrals.
Biometrika, 1954, 41, 351-360.

Thurstone, L. L. The indifference function. J. soc. Psychol., 1931, 2,
139-167.

Thurstone, L. L. The prediction of choice. Psychometrika, 1945, 10,
237-253.

Thurstone, L. L. An experiment in the prediction of choice. Univer. of
Chicago, Psychometric Lab. Res. Rep., 1951, No. 68.


Journal of Applied Psychology
eae, No, 3, 1959 


CATEGORIES OF THOUGHT OF MANAGERS, CLERKS,
AND WORKERS ABOUT JOBS AND PEOPLE
IN AN INDUSTRY 1


HARRY C. TRIANDIS 2 


Cornell University 


Recent work in perception (Hayek, 1952) 
and thinking (Bruner, Goodnow, & Austin, 
1956) has used categorization as one of the 
main units of analysis. Kelly (1955) has 
stressed the usefulness of the knowledge of 
the patient’s categories of thought, or personal
constructs, in clinical therapy. Triandis
(1958) has shown that the more similar the
categories of thought employed by two people,
the more likely it is that they will communicate
and the greater the likelihood that they
will like each other.


The present report describes a method for obtaining the categories of
thought of Ss, presents lists of these categories obtained from
various groups in industry, and attempts to assess the significance of
these differences. It concentrates on only two cognitive domains: jobs
and people.


Method 
Procedure 


Triads of jobs (or people) were presented to the Ss, who were asked
“Which one of these three jobs (or people) is more different from the
other two?” and “Why?” The adjectives or characteristics of the jobs
(or people) which were obtained in this way, together with their
opposites, also supplied by the Ss, are the categories of thought of
the Ss. The categories were obtained in group sessions with about 5 to
15 Ss, of the same status, attending each session. All Ss were given
12 triads of jobs and 12 triads of people. The triads were formed with
the following jobs: the present job, a previous job, the job that S
would be likely to be doing if he did not have his present job, a job
S would like to have, a job S hopes to have some day, a job that S
considers very useful, a job that would make S very happy, a job S
considers very interesting, and the best paying job that S thinks he
will have some day.


1 This paper is based on portions of the writer’s doctoral
dissertation. The author gratefully acknowledges the guidance and help
of W. W. Lambert, T. A. Ryan, and W. F. Whyte. The larger study, of
which this is a part, was supported by a grant from the Foundation for
Research on Human Behavior.

2 Now at the University of Illinois.


The person triads were formed with the following people: the self, the
S's supervisor, a person [illegible] by S, a hardworking fellow
worker, a skillful fellow worker, a fellow worker devoted to his job,
a fellow worker for whom S feels sorry, a person who is well known
throughout the company, and a successful person who is personally
known by S. No repetitions were allowed in the choice of jobs and
people to fit the above designations, or in the categories that were
elicited from each triad. The Kelly (1955) group presentation format
of the Repertory Test was used. The test sessions were of a 2-hr.
duration.

Two sophomores, acting as clerical assistants, classified the
categories obtained from each group of Ss as follows: A protocol was
picked at random and all the categories were recorded. Each of the
categories of the second and all subsequent protocols were judged
either similar or different from the categories already recorded. If
they were judged to be similar (say, good-bad and positive-negative
are similar), a checkmark was entered next to the already existing
category. If they were judged to be different, they were recorded so
as to constitute a new class of categories. Categories with more than
one entry were called category-classes.

The category-classes so constituted were then rearranged in “logical
fashion” and formed category-groups; for instance, the
category-classes pleasant-unpleasant, good-bad, beautiful-ugly, etc.
formed the “evaluative” category-group.
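The clerical classification step described above can be sketched as a simple tallying routine. This is only an illustration: in the study the similarity judgment was made by human judges, so the `is_similar` function, the `canon` lookup, and the example protocols below are hypothetical stand-ins. As a simplification, the sketch applies the similarity check to every category, including those within the first protocol.

```python
def classify_categories(protocols, is_similar):
    """Tally categories into classes, one protocol at a time.

    Each class is stored as [first category recorded, number of entries].
    A category judged similar to an existing class adds a checkmark to it;
    otherwise it starts a new class.
    """
    classes = []
    for protocol in protocols:
        for category in protocol:
            for entry in classes:
                if is_similar(category, entry[0]):
                    entry[1] += 1
                    break
            else:
                classes.append([category, 1])
    return classes

def category_classes(classes):
    """Keep only classes with more than one entry (the 'category-classes')."""
    return [(label, count) for label, count in classes if count > 1]

# Hypothetical similarity judgments, standing in for the human judges.
canon = {"fine-poor": "good-bad", "fast-slow": "slow-fast"}

def is_similar(a, b):
    return canon.get(a, a) == canon.get(b, b)

protocols = [
    ["good-bad", "slow-fast"],
    ["fine-poor", "old-young"],
    ["good-bad", "fast-slow"],
]
classes = classify_categories(protocols, is_similar)
print(category_classes(classes))
```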


Location 


The study was conducted in a rural New York State community of about
2000 people, which we shall call Treeville. The company which was
studied employs about 300 people. It manufactures [illegible]motive
equipment requiring skillful [illegible] production and markets it
nationally. The headquarters are in Treeville.


Subjects 


[...] lower management, the [...] workers is predominately [...].
Many workers were unsuccessful farmers before taking a job with the
company. Only the highly skilled had been industrial workers all their
lives. On the other hand, the background of the male clerks and
the upper managers is predominately urban. The
average age of the workers is much higher than that 
of the other groups (32% were older than 51 years; 
only 13% of the top managers were older than 50). 
The average years of residence in the vicinity of 
Treeville was much higher for the workers (20 
years) than for any of the other groups. 


Results 


The complete lists of the categories that were obtained are available
elsewhere (Triandis, 1958, pp. 124-134). Separate lists are available
for the upper managers, the lower and middle managers, the female
clerks, the male clerks, and the workers. Most of these lists contain
more than 200 categories, arranged in more than 100 category-classes.
In order to reduce the number of category-classes in each list, the
number of categories in each category-class was considered. Since the
number of Ss, the number of categories obtained, and the number of
category-classes vary from group to group, it was considered desirable
to use a cutting point that will consider all three variables. Such a
criterion was obtained by means of an adaptation of the chi-square
test.3 The criterion is equivalent from group to group because it
takes into consideration the three variables mentioned above.
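The cutting point can be illustrated numerically. The sketch below rests on assumptions that go beyond what is legible in the source: it takes the statistic to be the single-cell form (x - m/n)^2 / (m/n), refers it to chi-square with one degree of freedom (whose critical value is the square of a two-sided unit normal deviate), and uses round hypothetical figures of m = 200 categories in n = 100 category-classes. The function name `retention_cutoff` is illustrative.

```python
import math
from statistics import NormalDist

def retention_cutoff(m, n, alpha=0.05):
    """Frequency a category-class must exceed to be retained.

    m: number of categories given by the group; n: number of category-classes.
    Each of the n tests is run at level alpha / n.
    """
    expected = m / n                       # expected entries per class under equal chance
    per_test_alpha = alpha / n             # conservative allowance for n tests
    z = NormalDist().inv_cdf(1.0 - per_test_alpha / 2.0)
    chi_square_crit = z * z                # chi-square critical value, 1 df (assumed)
    # Solve chi_square_crit = (x - expected)^2 / expected for x above expectation.
    return expected + math.sqrt(chi_square_crit * expected)

cutoff = retention_cutoff(m=200, n=100)
print(round(cutoff, 2))
```

Under these assumptions a class would need about seven entries to be retained when two are expected by chance, which shows how the cutoff scales with both the number of categories and the number of classes.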

Tables 1 and 2 present the most frequently used category-classes for
people and jobs. Table 3 presents the category groups with the highest
frequency of entry.

3 If N Ss gave m categories of thought and these categories were
classified in n category-classes (of course, n < m), then if there is
an equal chance that a given category of thought will be classified in
any of the n category-classes, there will be m/n categories of thought
in each category-class. Furthermore, with n category-classes we can do
n chi-square tests by comparing the frequencies in each category-class
with the frequencies in all the other category-classes. The level of
significance of our chi-square tests should be .05/n (this is a
conservative way of taking care of the multiple tests). If \(\chi^2\)
is the value of the chi square that is significant at that level, we
will have

\[ \chi^2 = \frac{(x - m/n)^2}{m/n} \]

where x is the observed frequency in a given category-class. If we
solve for x, we can state that any category-classes that have
frequencies that exceed x are retained in our summary lists. This
procedure retains only the most frequent category-classes, considers
the number of Ss, the number of categories, and the number of classes,
and provides a criterion that is equivalent from group to group.


Discussion 
The Broad Categories (Category-Groups) 


Table 3 suggests that when members of 
lower management perceive people, in the in- 
dustrial situation, they tend to use power 
categories—is this fellow a supervisor? Does 
he have authority? Is he an executive? 
These are the important dimensions for this 
group. Our interpretation is that the transi- 
tiveness of the lower management’s position 
in the hierarchy of the organization makes 
power an important broad category, whereas 
for those who are at the bottom, and for those 
who “have arrived,” this category is not as 
important. 

Evaluation and, to a lesser extent, activity 
seem to be fundamental categories of perceiv- 
ing people, since they are used by all the 
groups significantly frequently. In other 
words, when the Ss perceive another person, 
they tend to evaluate him—how good, fair, 
capable, cooperative, nice person, well liked, 
sociable, etc. is he? They are also concerned 
with his rate of activity; slow-fast, active- 
passive. Within the broad category of evalua- 
tion, however, there are differences between 
the groups. The upper managers emphasize 
background—college trained, educated, pro- 
fessional, good background—while the work- 
ers emphasize dependability—careful, safe to 
work with, thorough. The latter finding is to 
be expected in a shift shop much more than 
in other work situations; sloppy work of one 
shift may create problems in the next shift. 
Since the company’s factory operates on a 
two-shift basis, we would expect to get this 
dimension stressed in this factory, while in 
other factories some other aspect of evalua- 
tion may be more important. 

When perceiving jobs, job characteristics 
and job requirements seem to determine the 
field. Whether the job is buying or selling, 
production or planning, rare or common, in- 
volves personal contacts or no personal con- 
tacts, is seasonal or steady, etc. seems impor- 
tant. Among the requirements, intelligence, 
creativeness, ability to deal with details, edu- 
cation, etc. are important. All groups, except 
the workers, are concerned with the “power”
aspects of the job—how much authority. The 
workers are concerned with whether the job 




Table 1


Category-Classes with the Highest Frequencies of Entries for All Groups of Subjects. Domain: People


Upper Management Workers 
Introvert-extrovert* 
Gracious-crude* 
Experienced-inexperienced* 
Educated-uneducated* 
Tall-short* 

Old-young* 


Slow-fast* 

Gets high pay-gets low pay* 
Stable-unstable* 

Quiet-loud 
Intelligent-unintelligent* 
Skilled-unskilled* 
Worker-manager* 


Other Management 
Supervisor-nonsupervisor* 
Intelligent-unintelligent* 
Introvert-extrovert* 


Worker-manager* 


Office-factory* 
Experienced-inexperienced* 


Female Clerks 


Ambitious-unambitious* 
Introvert-extrovert* 
Young-old* 
Executive-nonexecutive* 


Intelligent-unintelligent*** 
Nice-poor personality* 
Dependable-undependable* 
Quiet-loud* 


Friendly-unfriendly* 
Tall-short* 
Quiet-talkative* 


Male Clerks 
Old-young* 
Office-factory* 
Quiet-talkative* 


Introvert-extrovert*** 
Worker-manager* 
Experienced-inexperienced* 


Friendly-unfriendly* M 
Intelligent-unintelligent* 
Executive-nonexecutive 


Note. The number of asterisks indicates the importance of the category
for the group. The categories are listed in order of importance.


Table 2


Category-Classes with the Highest Frequencies of Entries for All Groups of Subjects. Domain: Jobs


Upper Management 


. Workers 
High-low pay** High-low pay*** 
Much-little responsibility** Dirty-clean** 
Broad-specific duties** Skilled-unskilled* 
Executive-nonexecutive* Difficult-easy* 
Creative-uncreative* Requires much-little intelligence* 
Many-few personal contacts* 
Skilled-unskilled* Female Clerks 
Worker-manager*

Interesting-uninteresting***


Skilled-unskilled*** 
Office-factory** 

Many-few contacts with people** 
Employer-employee* 


Other Management 
Deals with details-generalities** 


Much-little responsibility**
Skilled-unskilled* High-low pay* Planning-production*

Executive-nonexecutive Supervisory-nonsupervisory*


Male Clerks 
Requires much-little responsibility*** High-low pay** Dirty-clean* 
Mental-manual work* Employer-employee** 


Executive-nonexecutive* 


Note.—See footnote to Table 1. 


Table 3

Category-Groups with the Highest Frequency of Entry

People
Upper Management: Evaluation,*** Activity,* Background*
Lower Management: Evaluation,*** Power**
Workers: Evaluation,*** Activity,** Dependability*
Clerks: Evaluation,*** Activity**

Jobs
Upper Management: Job characteristics,*** Job requirements,* Power*
Lower Management: Job characteristics,*** Power, Job requirements*
Workers: Job characteristics,** and Requirements,** and Evaluation**
Female Clerks: Job requirements,** and Characteristics,* and Power*
Male Clerks: Job characteristics,** and Requirements,* and Power*

Note. See footnote to Table 1.


is clean or dirty much more than any of the
other groups.


The Categories of Various Groups: For People 


Turning now to more specific aspects of the problem, we observe that
the female clerks, when they perceive people in industry, tend to
stress the dimensions which one might expect are relevant to “eligible
bachelors”: The most frequently used categories are intelligent,
ambitious, nice personality, friendly, [illegible], dependable, young,
etc.

The reader is free to give his own interpretations to the lists
presented in Tables 1 and 2. The present writer’s interpretation of
these lists is as naive as possible. He sees the lists as expressions
of the concern of Ss about status. Status, however, is defined in
different terms by each group. The top managers use upper class
criteria, such as graciousness, education, and polish. The lower
managers make power distinctions (worker-manager, office-factory). The
workers take the more ordinary approach to status, namely money. This
suggests, if wild speculation is permissible, that rewards such as
membership to the country club would be
most effective with upper management, while
lower management may simply settle for a 
nice sounding title and the feeling that they 
have power over others. Finally, the workers 
would probably settle for just money. Of 
course, if they did get money they would want 
power, too, and if they did get power, they 
would want membership in the country club. 
This suggests a sort of hierarchy of rewards, 
parallel to Maslow’s (1954) hierarchy of 
motives (survival, safety, belongingness, es- 
teem, and self actualization). It is the writer’s 
impression that top management is very well 
paid in relation to the income of the average 
resident of Treeville. An income of $5000 
per year is considered “making good money” 
in town. The workers earn much less than 
that. On the other hand, the top managers 
earn at least twice and often three or four 
times this sum. Furthermore, their income 
exceeds that of even the wealthy residents of 
Treeville. In view of this difference, it seems 
reasonable that workers would consider 
“money he earns” as one of the dimensions 
of interpersonal perception in the factory, 
since this is a dimension of “relative depriva- 
tion.” The idea that “relative deprivation” 
may be a factor determining whether a dimen- 
sion will be used more often than other di- 
mensions by a particular group is strength- 
ened when we consider the lower management. 
This writer, in the course of interviewing the 
lower management, was left with the definite 
impression that this group was very much 
concerned with problems of authority. It 
seems that most decisions are still made at 
the top and the lower managers simply carry 
them out, often blindly. Whether or not a 
person has the authority to make decisions 
is crucial for lower management; it is a di- 
mension along which they are “relatively de- 
prived.” Our data show that authority, or 
power, is an important category of that group. 
On the other hand, their pay is around 
$5000, which is “good enough,” and so they 
do not use the dimension of pay in their 
interpersonal perception nearly as much as 
the workers. 

Another theme that runs through these categories is what Parsons and
his associates (1953, 1955) call “instrumental versus expressive.”
Upper management and the clerks are concerned more with expressive
categories (gracious, friendly, extroverted, talkative, many-few
personal contacts), while lower management and the workers are more
concerned with instrumental categories (intelligent, skilled,
dependable, stable).

Certain categories are used by all groups; for instance,
friendly-unfriendly, intelligent-unintelligent, skilled-unskilled,
experienced-inexperienced, introvert-extrovert, old-young, cripple-fit,
quiet-talkative, excitable-calm, slow-fast, worker-manager,
office-factory. Some dimensions seem to be stressed almost exclusively
by some groups; e.g., upper management stresses well-poorly groomed,
educated, has university degree, gracious-crude, fast-slow thinker,
staff-line, etc., lower management stresses supervisor-not supervisor,
the workers stress quality worker, serious worker, good-bad,
careful-careless, stable-unstable, alert-sleepy, low-high pay.

Certain categories are used by women more often than they are used by
men. Young,4 working with triads of community organizations, found
such differences too. He found, for instance, that women stressed the
religious character of the organizations more often than the men; men
stressed more often whether the organization is all male or mixed.
Similarly, in our lists we found many strictly male dimensions, such
as knows-does not know his job, slow-fast, dependent-independent,
experienced-inexperienced, has much drive-is lazy, and some female
dimensions, such as honest-dishonest, polite-impolite,
dependable-undependable, married-single, etc. It seems that the male
dimensions are somewhat more instrumental and the female dimensions
somewhat more expressive. This is consistent with Parsons’ (1955)
theorizing.

When we undertook this research we did not expect to obtain any
differences along the instrumental-expressive dimension. Since we did
find some such differences, however, it may be worth exploring this
aspect of our findings more than we have so far. One question of
considerable interest is whether good leaders use both instrumental
and expressive categories, as is consistent with the “great man”
theory of leadership. Borgatta, Couch, and Bales (1954) seem to think
so, and have experimental evidence to support this view. Turning to
our data we find that the upper managers (who are presumably better
leaders than their subordinates) used a much more balanced set of
categories than any of their subordinates. This supports the “great
man” theory of leadership. It also suggests that leadership training
is, perhaps, a process whereby a person comes to see other people in
terms of both instrumental and expressive categories.


4 F. Young. Personal communication, 1957.


The Categories for Jobs 


All groups, except the women, consider pay an important characteristic
of the job. Skill, undoubtedly interpreted variously by each group
(administrative, manual, typing, etc.), is on all the lists except
that of the male clerks. The nature of the work (creative,
interesting) and certain expressive characteristics seem to concern
the upper management and the female clerks. Status symbols, such as
manual-mental, office-factory, dirty-clean, appear in some lists. The
workers seem more concerned with job requirements, particularly
intelligence, and seem to have a sort of “inferiority complex” about
intelligence. This came out also in interviews where the workers often
told me: “I don’t think I have the education to understand this,” “I
wish I were more intelligent,” “management knows about this, I don’t,”
etc. The final observation is that the two management groups use
categories that are more similar to each other than to any of the
other groups.

In the case of jobs, just as in the case of people, certain categories
were used by all the groups. These include indoors-outdoors,
difficult-easy, skilled-unskilled, gets high-low pay, requires
much-little education, involves [illegible]-no responsibility,
desirable-undesirable, has-has no authority, employer-employee,
[illegible] a lot-stays put, routine-variable,
interesting-uninteresting, manual-mental.

Some of these categories are stressed more by one group, however, than
they are stressed by another. Some categories are predominantly
managerial (planning-production, deals with theory-practice, requires
good [illegible], policy making, has many-few contacts with people,
creative, deals with generalities-details, sales-production), others
are predominantly worker categories (careful-sloppy, requires
extensive training, dirty-clean, honest-dishonest, low-high type work,
dressed up-in work clothes).


General Discussion 


__ The reader must keep in mind the fact that 
if a certain characteristic is approximately 
equally distributed among the various ele- 
ments (people or jobs) involved in the triadic 
Comparisons, this characteristic will not ap- 
Pear in our list of Tables 1 and 2. This is 
Simply because of the nature of the triadic 
Procedure. Equal distribution of the charac- 
teristic implies that it can be completely ab- 
Sent, it can be present, or it can be present 
in large amounts; as long as it is equally 
Present, it will not appear. Thus, if you 
Compare, in the triadic procedure, three mil- 
‘onnaires, or three hobos, money will not be 
Siven as the characteristic along which these 
People are judged. When we find, then, that 
© managers do not consider “how much the 
man makes” as one of the important attri- 
butes of the man in industry, this does not 
imply that they do not consider this charac- 
teristic when they meet a millionnaire. 

What, if anything, is to be gained by analyzing
the categories of various groups? As
has already been suggested, it may be shown
that certain individuals, e.g., top management
(which implies good leaders), use more of
one type of category than another. But more
important, perhaps, is the question of “what
categories should a group use when trying to
communicate with another?” Suppose, for
example, that management is preparing a
merit rating scale for use with workers. What
categories should it use?

The analysis of one merit rating scale shows
very little overlap of the categories used by
workers in our study and the categories used
in that merit rating scale. It is an empirical
question whether workers judged on such a
merit scale would find it just and fair. It is
possible that a merit rating scale using some
of the categories that we obtained from our
triadic procedure would be more effective.
Merit rating has, among other functions, the
most important function of providing material
for discussions on how the employee may improve.
One suspects that trying to convince
a worker to improve his industriousness will
be less effective than trying to convince him
to improve his skill, or his chances of advancement.

Examination of the job evaluation plan of 
the National Electrical Manufacturers Asso- 
ciation (Tiffin, 1952) shows that the cate- 
gories they have used overlap very much with 
the categories found in our research project. 
They used skill, effort, job conditions, etc. 
This plan, then, is probably more effective, 
from the point of view of communication, 
than the previously mentioned merit rating 
plan. We advance the hypothesis then that 
if “management uses the worker’s dimensions 
in its communication with the workers it will 
be more successful in its communication.” 
The evidence presented by Triandis (1958) 
supports this hypothesis. 

It must be kept in mind, however, that the 
categories presented in Tables 1 and 2 apply 
to a particular plant, in a particular town. 
Only more research will answer such questions 
as whether or not the patterns that were ob- 
tained are stable, what is the influence of 
situational variables on the patterns, etc. 


Summary 


Triads of jobs and people were presented 
to 105 Ss. The Ss were managers, clerks, 
and workers in a small New York State indus- 
trial concern. The Ss were asked “Which one 
of these three jobs (people) is more different 
from the other two?” and “Why?” The char- 
acteristics that differentiated one member of 
the triad from the other members were listed. 
Certain differences in the lists obtained from 
the various groups were observed. An at- 
tempt was made to assess the significance of 
these differences for intergroup communications
in industry.


Received December 15, 1958. 


References 


Borgatta, E. F., Couch, A. S., & Bales, R. F. Some
findings relevant to the great man theory of leadership.
Amer. sociol. Rev., 1954, 19, 755-759.

Bruner, J. S., Goodnow, J. J., & Austin, G. A. A
study of thinking. New York: Wiley, 1956.

Hayek, F. A. The sensory order. An inquiry into
the foundations of theoretical psychology. Chicago:
Univer. Chicago Press, 1952.

Kelly, G. A. The psychology of personal constructs.
New York: Norton, 1955.

Maslow, A. H. Motivation and personality. New
York: Harper, 1954.

Osgood, C. E., & Suci, G. J. Factor analysis of
meaning. J. exp. Psychol., 1955, 50, 325-338.

Parsons, T., & Bales, R. F. Family socialization and
interaction process. Glencoe, Ill.: Free Press, 1955.

Parsons, T., Bales, R. F., & Shils, E. A. Working
papers in the theory of action. Glencoe, Ill.: Free
Press, 1953.

Tiffin, J. Industrial psychology. New York: Prentice
Hall, 1952.

Triandis, H. C. Some cognitive factors affecting
communication. Unpublished doctoral dissertation,
Cornell Univer., 1958.


Journal of Applied Psychology
Vol. 43, No. 5, 1959


RELATIONSHIPS BETWEEN A TOP-MIDDLE MANAGEMENT
SELF-DESCRIPTION SCALE AND
BEHAVIOR IN A GROUP SITUATION 1


LYMAN W. PORTER AND ROGER A. KAUFMAN 2


University of California, Berkeley


1 The authors wish to thank Karl Hakmiller and
[illegible] Messman for their aid in the conduct of this
study.

2 Now at Pilotless Aircraft Division, Boeing Airplane
Company.


In a recent study by Porter and Ghiselli
(1957) the self-perceptions of top and middle
management personnel were compared.
Using a self-description inventory developed
by Ghiselli (1954), it was found that 21 of
64 pairs of forced-choice adjectives significantly
differentiated between the two groups.
An analysis of the differentiating items indicated
that top management personnel, when
contrasted with middle management personnel,
viewed themselves as being more self-reliant,
self-confident, enterprising, and bolder.
The middle management individuals, on the
other hand, described themselves as more
careful, thoughtful, deliberate, and controlled.

Ghiselli and Lodahl used the 21 differentiating
items in the Porter and Ghiselli study
to form a scale which they termed a Decision-Making
Approach (DMA) scale (1958a,
1958b). Although the scale is composed of
the items that differentiated top from middle
management personnel, it was given the name
Decision-Making Approach because many of
the items seemed to describe how individuals
in these groups might approach the decision-making
process if their behavior fitted their
self-descriptions. Ghiselli and Lodahl (1958a)
point out, however, that “this scale
probably measures more than just the approach
to decision making. . . .”

The DMA scale was first used by Ghiselli
and Lodahl in an investigation of group effectiveness
in a complex task requiring cooperation
among group members to perform the
task. The Ss were male college students who
worked in groups of two, three, and four to
operate a small model railroad by means of
[illegible] control switches. In order for a
group to perform effectively, the members
had to organize themselves so that the correct
switches were operated at the proper
time; the task thus required coordination of
the efforts of the several individuals. Ghiselli
and Lodahl related the pattern of scores on
the DMA scale achieved by the individuals in
a group to the effectiveness of the group on
the task. They found that the mean DMA
score of a group was not related significantly
to effectiveness, whereas the positive skewness
of DMA scores in a group (i.e., “the difference
between the highest and next highest
scorer, and this difference minus the range of
the lower scorers”) correlated .82 with group
effectiveness. It appears that the best performing
groups were those in which one person
scored high on the DMA (i.e., was toward
the top management end of the scale)
relative to the other members of the group,
and where the lower scores of these other
members formed a relatively small range.
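
The quoted definition of positive skewness can be made concrete with a small calculation. The sketch below is only one reading of that quotation, not Ghiselli and Lodahl's published scoring routine: it assumes that "the lower scorers" means every group member except the highest scorer, and the function name and example scores are hypothetical.

```python
def positive_skewness_index(scores):
    """One possible reading of the quoted index (an assumption, not the
    original authors' code): the gap between the highest and next highest
    scorer, minus the range of the lower scorers, where "lower scorers"
    is taken to mean everyone except the highest scorer."""
    if len(scores) < 2:
        raise ValueError("a group needs at least two members")
    ordered = sorted(scores, reverse=True)
    top_gap = ordered[0] - ordered[1]          # highest minus next highest
    lower = ordered[1:]                        # everyone but the top scorer
    lower_range = max(lower) - min(lower)      # spread among the lower scorers
    return top_gap - lower_range


# Hypothetical DMA scores for two three-person groups.
one_clear_leader = positive_skewness_index([30, 20, 19])
no_clear_leader = positive_skewness_index([30, 29, 10])
```

Under this reading, a group with one member well above two closely bunched members gets a high index, which is the pattern the text associates with the best performing groups.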

In a later study of industrial work groups, 
Ghiselli and Lodahl again found the pattern 
of DMA scores within a group to be impor- 
tant, this time in relation to the merit ratings 
of the foremen who directed the groups. It 
was found that there was a negative correla- 
tion (—.57) between a positively skewed pat- 
tern of scores of workers in a group and the 
merit rating given the foreman of that group. 
This negative correlation was interpreted as 
indicating that “when a foreman happens to 
be assigned to a work group which has con- 
siderable capacity for self management, higher 
management will regard him less well than if 
he happens to be assigned to a work group 
with little capacity for self management” 
(Ghiselli & Lodahl, 1958b, p. 185). 

The clear implication of both of the Ghiselli 
and Lodahl studies is that individuals’ DMA 
scores are indicative of certain types of behavior
in the work situation which affect the
performance of a group. However, neither
study presented evidence as to whether an
individual actually behaves in the task situation
as he describes himself on the DMA scale. 
The present study is aimed at this type of 
investigation. It is hypothesized that indi- 
viduals scoring high on the DMA scale will 
behave more like top management personnel 
describe themselves than will people scoring 
low. Specifically, in a situation requiring 
several individuals to interact with each other 
in order to perform a task, those high on the 
DMA should make a higher ratio of goal- 
setting, role-assignment, and task-performance 
suggestion comments to comments of an 
evaluative or cautionary nature, than should 
those scoring low on the DMA. It is fur- 
ther hypothesized that these differential be- 
havior patterns will be perceived by the group 
members themselves, and that therefore their 
ratings of their own group members on ques- 
tions dealing with these patterns of behavior 
will be in accordance with the members’ DMA 
scores. 


Method 
Subjects 


Sixty male undergraduate students, composing 20
groups of three persons each, served as the Ss in this
experiment.


Procedure and tasks 


At the beginning of an [...] of the three Ss in a
[...] (c) Talk- [...] (d) Certain restric- [...]
would be the total [...] construction periods.
[...] building periods and
the intervening one-min. discussion periods, one E
counted and classified the various comments made
by Ss to each other. (These data will be referred
to as the Verbal Interaction data.) At the same
time, another E observed the physical behavior of
the Ss in the construction of the structures and
ranked them after each building period on two different
types of physical behavior (discussed below
under Physical Behavior). At the conclusion of the
final building period the Ss filled out a questionnaire
on which they were asked to rank the group members
(including themselves) in response to each of
nine questions regarding their performance in the
task situation. (These data will be referred to as
the Peer Ranking data.)


Behavior measures 


The patterns of behavior that were predicted to
accompany high DMA scores will be termed top
management behavior, and those that were predicted
to accompany low DMA scores, middle management
behavior. Although such contrasting patterns of behavior
may not be unique to top and middle management
categories of individuals, the above terms
will be used because of the nature of the origin of
the DMA scale.

Verbal interaction. As previously noted, one of
the Es counted and classified the various comments
made by each S in each group’s experimental session.
Each comment or conversational unit was classified
into one of the following three categories that had
been set up in advance: (a) top management: these
conversational units included such types of comments
as goal setting ideas, directions, organizing
ideas, assignment of roles, and major suggestions for
changes in procedure or attack on the building problem;
(b) middle management: these conversational
units were remarks where the person evaluated ideas
of others, asked for clarification of others’ ideas, or
elaborated on others’ ideas; and (c) other: this category
included all remarks which could not meaningfully
be classified into either the top or middle category.

A Verbal Interaction score was obtained for each
individual by computing the ratio of the number of
his top comments to the number of his middle comments.
This ratio of top/middle was used in all results
involving Verbal Interaction data, and is the
primary behavioral data obtained in the study.
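
The top/middle ratio just described can be sketched in a few lines. This is an illustration under stated assumptions, not the authors' procedure: the category labels, the function name, and the handling of an S with no middle comments (returned as None, since the article does not say how such a case was treated) are all hypothetical.

```python
from collections import Counter


def verbal_interaction_score(categories):
    """Ratio of "top" comments to "middle" comments for one S.

    `categories` is a list with one label per classified comment; the
    labels "top", "middle", and "other" are hypothetical stand-ins for
    the three categories described in the text.
    """
    counts = Counter(categories)
    top, middle = counts["top"], counts["middle"]
    if middle == 0:
        # The ratio is undefined here; the article does not say how
        # such cases were handled, so None is returned as a placeholder.
        return None
    return top / middle


# Hypothetical record of six classified comments for one S.
example_score = verbal_interaction_score(
    ["top", "top", "middle", "other", "top", "middle"]
)
```

Comments classified as "other" do not enter the ratio, which matches the description of the score as top comments divided by middle comments.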
Physical behavior. While one E was recording the
verbal interaction of the Ss, another E was observing
the physical behavior of the Ss as they attempted
to build their card structures. Due to the
lack of discrete units of physical behavior in a situation
such as this, E ranked each of the three Ss at the
end of each two-min. building session on how much
of each of two types of behavior the S had exhibited
during the period. The two types of physical behavior
that were rated corresponded as nearly as
possible to the top and middle types of verbal behavior.
These two types of physical behavior were:
(a) top: actively building the structure by taking
the initiative in adding new parts to it; and (b)
middle: assisting the construction by holding parts
in place while others placed new ones on it, or by
checking parts of the structure to make sure it was
stable, solid, etc.


A Physical Behavior score was obtained for each
individual by weighting each ranking (three points
for a rank of one, two points for a rank of two, and
one point for a rank of three) for each two-min.
session, totaling the ranks for the six sessions, and
then obtaining a ratio of top ranks to middle ranks
for the individual.
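
The weighting and ratio described for the Physical Behavior score can be sketched as follows. The point values come from the text; the function names and the example ranks are hypothetical, and the sketch assumes one top rank and one middle rank per S in each of the six sessions.

```python
# Points per rank, as stated in the text: rank 1 = 3, rank 2 = 2, rank 3 = 1.
RANK_POINTS = {1: 3, 2: 2, 3: 1}


def weighted_total(ranks):
    """Sum the weighted ranks an S received across the building sessions."""
    return sum(RANK_POINTS[rank] for rank in ranks)


def physical_behavior_score(top_ranks, middle_ranks):
    """Ratio of weighted top-behavior ranks to weighted middle-behavior ranks."""
    return weighted_total(top_ranks) / weighted_total(middle_ranks)


# Hypothetical ranks for one S over six two-min. sessions.
top_ranks = [1, 1, 2, 1, 3, 1]      # weighted total: 3 + 3 + 2 + 3 + 1 + 3 = 15
middle_ranks = [3, 2, 3, 3, 1, 2]   # weighted total: 1 + 2 + 1 + 1 + 3 + 2 = 10
example_ratio = physical_behavior_score(top_ranks, middle_ranks)
```

Because every rank earns at least one point, the denominator cannot be zero, so this ratio is always defined for an S who was ranked in every session.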
Peer ranking measures

The questionnaire administered to each S at the
conclusion of the six building periods contained nine
questions concerning the past task performance or
[illegible] performance of the group’s members.
On each question the S was asked to rank the
three group members. Of the nine questions, six had
been designated beforehand as referring to top management
types of behavior or traits, and three to
middle management. Examples of top management
questions were: “Who in your group was the most
[illegible] and enterprising?” and “Who in your
group would make the best president of a large corporation?”
An example of middle management questions
was: “Who in your group would make the best
middle management executive (e.g., production manager,
personnel manager, sales manager, etc.)?” As
with rankings on physical behavior, each ranking
that a person received was weighted and total rank
scores for top and middle were obtained. Again, a
ratio-type top/middle score was obtained for each
S.


Group production measure

Although this study was concerned with the relationships
between individuals’ DMA scores and aspects
of their behavior, the nature of the group’s
task made it possible to obtain a crude measure of
group productivity that could be compared with
other measures of group behavior. The scoring procedure
described above that was presented to the Ss
in their instructions was used as the measure of
group productivity. This measure, however, proved
relatively unreliable (r = .40), and therefore
group production results will not be further reported
in this study.


Results

Table 1 presents the correlations between
individuals’ DMA scores and measures of
their behavior in the task situation. In the
top half of the table, all correlations are with
DMA raw scores. The bottom half of the
table presents correlations where individuals’
DMA scores are standard scores, each individual’s
score being based on the mean and
standard deviation of DMA scores in his
group of three individuals.
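
A within-group standard score of the kind described here can be sketched as below. The article does not state whether the standard deviation was computed with n or n - 1 in the denominator; the sketch assumes the population form (n), and the function name and example scores are hypothetical.

```python
from statistics import mean, pstdev


def group_standard_scores(scores):
    """Convert the raw DMA scores of one group into standard scores,
    using that group's own mean and standard deviation.

    Assumption: population standard deviation (divide by n). The
    article does not specify which form was used.
    """
    group_mean = mean(scores)
    group_sd = pstdev(scores)
    if group_sd == 0:
        # All members scored alike, so no one stands above or below the group.
        return [0.0 for _ in scores]
    return [(score - group_mean) / group_sd for score in scores]


# Hypothetical raw DMA scores for one group of three.
example_standard = group_standard_scores([10, 12, 14])
```

Standardizing within the group expresses each S's score relative to the other two members, so the same raw score can yield a high standard score in one group and a low one in another.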
The correlations that test the primary hypothesis
of this experiment are those between
DMA scores and Verbal Interaction scores.


Table 1

Coefficients of Correlation of Individuals’ DMA Scores
With Measures of Individual Behavior
and With Peer Rankings
(N = 60)

                                                r      p
Variables correlated with DMA raw scores
  Verbal interaction (raw scores)              .34    .01
  Physical behavior                            .04     -
  Peer ranking                                 .25    .05
Variables correlated with DMA standard scores
  Verbal interaction (standard scores)         .52    .001
  Physical behavior                            .18     -
  Peer ranking                                 .36    .01


These correlations between the top-middle 
management scale of the SDI and the ratio 
of top to middle management type of verbal 
interaction are .34 (p= .01) where raw 
scores on both measures are used, and .52 
(p = .001) where standard scores are used. 
Not only do high DMA scorers behave ver- 
bally in a more top management manner, but 
they are also seen as doing so by their peers. 
Peer rankings correlate .25 (p = .05) with
DMA raw scores and .36 (p = .01) with
DMA standard scores. The Physical Behavior
measure correlates insignificantly with
both raw and standard DMA scores. (Only 
standard scores are used throughout Table 1 
for Peer Ranking and Physical Behavior since 
both measures involve rankings of each indi- 
vidual relative to the other two members in 
his group.) 
Discussion 


The results of this study confirm the hy- 
potheses stated in the introduction. They 
show that scores obtained on a scale devel- 
oped from the differential self-perceptions of 
top and middle management personnel are 
significantly related to the type of behavior 
that a person exhibits in an actual group 
situation where some task must be performed. 
Specifically, those individuals who score higher 
on the DMA scale behave, in terms of verbal 
interaction with others, relatively more like
top management personnel describe themselves
than do persons who score lower on
the DMA scale. Also, those individuals who 
score higher on the top-middle management 
self-description scale are seen by their peers 
in the group situation to behave and appear 
relatively more like top management indi- 
viduals than do those who score lower on the 
scale. The results would appear to indicate 
that the DMA scale has some validity for 
predicting an individual’s behavior in a situa- 
tion where top or middle management types 
of verbal action are relevant. 

One particular aspect of the correlations 
between individuals’ DMA scores and meas- 
ures of their behavior that are presented in 
Table 1 deserves further comment. It will be 
noted in this table that the correlations are 
consistently higher when DMA standard 
scores rather than raw scores are used. This 
indicates that predictions made from the 
DMA regarding a person’s actual behavior in 
a situation will be more valid when something 
is known about the DMA scores of the other 
individuals in the group. This finding lends 
support to the more general notion that some- 
thing must be known about the characteristics 
of all the members of a group if knowledge 
of a particular individual's traits (whether 
gained through self-description or otherwise) 
is to be used most accurately to predict his
behavior in a social situation. An individual’s
possession of particular traits or characteristics
is important in determining his behavior
in a group situation, but also important is
the distribution and strength of his characteristics
in relation to the distribution and
strength of characteristics possessed by others
in the group.


Summary 


Previous studies have described the development
of a top-middle management scale derived
from the differential answers of top and
middle management personnel to a self-description
inventory. The present study was
designed to determine whether individuals
whose answers on this self-description inventory
give them a high score on the top-middle
management scale behave in a group situation
in accordance with their relatively “top management”
self-descriptions.

Sixty male undergraduates serving in groups
of three were the Ss in this experiment.
Each group of three Ss first completed the
self-description inventory individually and
then participated in a group task involving
building a structure out of 8 X 11-in. cardboard
cards. The task required group cooperation
and considerable verbal interaction.
The comments of Ss while performing the task
were categorized by one of the Es into top
or middle management types of remarks. At
the conclusion of the group task session, Ss
ranked their peers on questions dealing with
top and middle management types of behavior.

The results showed that individuals’ scores
on the top-middle management self-description
scale were correlated significantly with
the type of verbal interaction exhibited in the
experimental session. Also, the self-description
scores correlated with the peer rankings
made by the group members.


Received February 16, 1959. 


REFERENCES

Ghiselli, E. E. The forced-choice technique in self-description.
Personnel Psychol., 1954, 7, 201-208.

Ghiselli, E. E., & Lodahl, T. M. Patterns of
managerial traits and group effectiveness. J. abnorm.
soc. Psychol., 1958, 57, 61-66. (a)

Ghiselli, E. E., & Lodahl, T. M. The evaluation
of foremen’s performance in relation to the internal
characteristics of their work groups. Personnel
Psychol., 1958, 11, 179-188. (b)

Porter, L. W., & Ghiselli, E. E. The self-perceptions
of top and middle management personnel.
Personnel Psychol., 1957, 10, 397-406.


Journal of Applied Psychology 


Vol. 43, No. 6


DECEMBER, 1959


EMPLOYEE ATTITUDES TOWARD TECHNOLOGICAL
CHANGE IN A MEDIUM SIZED INSURANCE
COMPANY 1


EUGENE JACOBSON,2 DON TRUMBO,3 GLORIA CHEEK, AND JOHN NANGLE


Labor and Industrial Relations Center and Department of Psychology,
Michigan State University


A particular aspect of employee adjustment
to technological change is the way in
which the employee experiences the change.
His perception of what is happening and how
it affects him can be expected to influence his
reaction to the change. When the employee’s
perception of the change and his response to
it is known, it is equally important to attempt
to discover possible determinants of these
phenomena.

This paper is the first in a series reporting
on a set of studies designed to explore the
effect of supervisory practices, communication,
employee personality, and employee history
on employee response to change. In it
we describe some initial findings about office
employee response to the introduction of a
computing machine. Other reports will present
material about possible determinants of
these responses.


Technological Change in the 
Office Situation 


A basic assumption of these studies is
that change is always occurring in work situations.
In the contemporary office in the United
States the increased use of computing machines
is making the study of change perhaps
even more relevant than the corresponding
introduction of automated work processes in
the factory, where there is a longer history of
adaptation to these devices.

1 The research reported is part of a series of projects
conducted in the Labor and Industrial Relations
Center at Michigan State University under the guidance
of Jack Stieber. Einar Hardin, Economics Department,
cooperated with the authors in all phases
of the study and is reporting, in other publications,
on job changes and employee response. Doctoral
dissertations based on these studies have been prepared
by the junior authors under the supervision
of James Karslake, Department of Psychology.

2 Dr. Jacobson is on leave from Michigan State
University as Chief, Division of Applied Social Sciences,
Department of Social Sciences, UNESCO.

Even in the larger offices, there has been
some opportunity for experimenting with the
new equipment. But it is only recently that
the smaller companies have begun to use the
complex data handling machines and the
smallest still are not able to afford this equipment.
When a medium sized company that
has been using traditional data processing
methods installs some of these machines it
might be expected that employees would be
very much aware of the change. It is this
kind of situation, a medium sized insurance
company using its first electronic data computing
and storing procedures, that we chose
for our first study of employee response to
change.


The Research Site 


The company employs about 500 persons,
300 of them housed in a single central home
office building, all engaged in activities related
to selling and servicing insurance policies.
Eighty percent of the nonsupervisory
employees are women. Half of the women
are married. Sixty-five percent of the nonsupervisory
employees are less than 35 years
old. About half of the nonsupervisory employees
report that they are neither the only
wage earner in the family nor the main wage
earner and that their household could live
adequately if they were not working. Almost
all of the employees have lived in the
same geographical area most of their lives.
Almost all have high school educations, and
about a quarter have some additional formal
education. Forty percent of the nonsupervisory
employees had worked for the company
for a year or less at the time of the
study.

3 Dr. Trumbo is now on the staff of the Department
of Psychology, Kansas State College, Manhattan,
Kansas.

The company is located in a city of about 
100,000 population, mixed industrial, com- 
mercial, and government with a large number
of offices employing clerical workers. The
company has a reputation as a good place to 
work, a new modern building, and many bene- 
fit plans. It maintains pay scale at commu- 
nity level, has had a continuously expanding 
work force with no layoffs, and is perceived 
as a prosperous and growing organization. 

The company was organized in the early
1900’s, had a relatively slow growth until the
middle of the 1940’s when it began to expand
rapidly, and is still increasing in size. Until
the early 1950’s the office was operated along
traditional lines. There were some office machines
used but they were not central to the
entire work operation. As the business machine
companies produced more and more
elaborate data processing devices, the company
became more self-conscious about its
work procedures and has had a history of reexamining
work flow and adjusting methods
to allow maximum use of the new equipment.

But all of this change was at a relatively 
slow rate as compared with the changes in- 
troduced in August 1956, when a medium 
sized electronic computing device was installed.
This machine can store and selectively
reproduce data, perform a series of
complex, related operations, and provide data
for other machines. It is the response to the
installation of this major technological innovation
that we studied in February 1957.


Procedure 


Before and during the period of installation of the
computer we had regular contacts with the company,
talking with the staff and line personnel who
were planning and supervising the change. The machine
was physically in the company’s offices in July
1956. By December 1956 it had gone into operation
on a major processing operation. By the end
of February 1957 the computer had been in use for
about three months.

At that time we administered pencil and paper
questionnaires to all of the home office employees
below the level of the Board of Directors. About
230 nonsupervisory and about 50 supervisory employees,
assembled in three groups on [illegible] successive
days in the company’s meeting room, filled out
the hour long questionnaires. This is about [illegible] of
the total home office population. An examination
of nonresponse showed no significant biases.

Questionnaire material included items on attitudes
to change, response to the computer, supervision,
job satisfaction, communication, and background
and personal history. The questionnaires
for the supervisory and nonsupervisory employees
were essentially identical, except for items on supervision.
These were framed in a complementary
fashion, so that employees were responding about the
behavior of their supervisors and supervisors were
responding about their own behavior.


Results 


From these questionnaire materials we have
selected four facets of the nonsupervisory employees’
response to change for examination:
the employees’ perception of the general impact
of the installation of the new computer
on their jobs, the employees’ perception of
the impact of machines in general on the office
situation and jobs, the employees’ general
attitudes toward technological change,
and the employees’ perception of what is happening
to a number of specific aspects of their
jobs because of technological change.

In broad summary, these findings indicate
that the bulk of the employees are sensitive
to the change that has occurred, see it as having
important effects on their own jobs, and
on the opportunities for employment in their
occupational field. They recognize that workers
are being replaced by machines but do
not feel that they themselves will be replaced.
About a quarter of the employees feel that
new developments in technology are taking
place more rapidly than is desirable, but they
themselves welcome change in their own jobs.
They report that the kinds of changes that 
are traceable to the new equipment have to 
do with amount of work and variety and in- 
terest of the work, but do not believe that 
pay, promotion, or supervision have been affected
by the change.

In detail, the findings are as follows:

How did the changeover affect the employee?
About one employee in three reports
that the changeover to the computer had a
relatively marked effect on his job. Two percent
say they were promoted, 4% that they
were transferred, and 27% that they kept
the same jobs but the work was noticeably
changed (Table 1).

About 6% report dislike of the effect of
the changeover, about one-half report that it
made no difference to them, and 40% that
they like the effect of the changeover.

About one employee in three reports that
the effect of the changeover was quite disrupting,
and the remainder that it was only
slightly or not at all disrupting.

When asked to anticipate the effect of the
computer on their jobs in the next year or
two, about 40% of the employees see that it
is probable that the computer will have some
influence on them. About the same percentage
do not believe that the computer will
affect their jobs.


Table 1

What Effect Did the Changeover (to the New
Computer) Have on Your Job?

                                                  Percentage     N
Was promoted                                           2          5
Was transferred to another job                         4         ...
Kept the same job, but the work
  was greatly changed                                 10         ...
Kept the same job, but the work
  was noticeably changed                              17         ...
Kept the same job, and the work
  was slightly changed                                18         ...
Kept the same job, and the work
  was not changed                                     45        104
NA                                                     4          9

                                                     100        232


Table 2

Are the Chances That a Machine Will Replace You on
Your Job Greater or Less Than for Most Jobs?

                                                  Percentage     N
Much or somewhat greater than
  for most jobs                                       13         29
Somewhat less than for most jobs                      30         69
Much less than for most jobs                          51        119
NA                                                     6         15

                                                     100        232


What are machines doing to jobs? About 
one quarter of the employees believe that 
machines have changed the nature of their 
jobs to a fairly large extent in the past two 
years. An additional 47% perceive some 
change in their work because of machines. 
This does not necessarily mean that the em- 
ployee is reporting a change in task. He 
may be reflecting what he senses to be 
changes in how his job fits in with others or 
the general work atmosphere. Eighty percent 
of the employees who perceive quite a bit of 
machine induced change say that they like it. 

About 1 in 10 believes that machines have 
replaced workers to a large extent in insur- 
ance companies in the past two years. An- 
other 60% say that machines have replaced 
workers in insurance companies to some ex- 
tent. About half of those who see this dis- 
placement occurring express themselves as be- 
ing indifferent to it, about a quarter approve 
and 1 in 5 disapproves. 

But although three quarters of the em- 
ployees believe that machines have replaced 
workers in insurance companies, about 80% 
feel that the chances that they themselves 
will be replaced by machines are less than for 
most jobs (Table 2). 
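
The "about 80%" figure can be checked against the counts printed in Table 2. The short calculation below is only an arithmetic illustration using those counts; the variable names are hypothetical.

```python
# Counts (N) as printed in Table 2.
table2_n = {
    "much or somewhat greater": 29,
    "somewhat less": 69,
    "much less": 119,
    "NA": 15,
}

total = sum(table2_n.values())
less_than_most = table2_n["somewhat less"] + table2_n["much less"]

# Share of all respondents who see their chances of replacement
# as less than for most jobs, in percent.
share_less = 100 * less_than_most / total
```

The two "less" rows together account for 188 of the 232 respondents, which agrees with the sum of the printed percentages (30 + 51) and with the "about 80%" reported in the text.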

Those who foresee least likelihood of their
being replaced are more happy with their
predictions than the others.

When asked to estimate what will happen 
to the total number of people doing their kind 
of job in the next five years, about 4 in 10 
see an increase, another 40% see the number 
remaining about the same and only 1 in 8
sees the number of people doing his kind of
work decreasing. Those who do see decreased
opportunities in their kind of employment are
less happy.


Table 3


The Job That You Would Consider Ideal for You Would
Be One Where the Way You Do Your Work:


Percentage N


Is always the same 3 7
Changes very little 7 17
Changes somewhat 44 101
Changes quite a bit 28 65
Changes a great deal 18 41
NA - 1

100 232

What does the employee think about tech- 
nological change? As a general comment on 
the impact of machines on jobs, about 1 em- 
ployee in 4 believes that new developments 
in machines and methods for doing work are 
taking place more rapidly than is desirable. 
About half feel that the rate is satisfactory 
and only 1 in 6 feels that it is too slow. 

When asked what kind of job would be 
ideal for them, only 1 in 10 reports that he 
wants a job where the way the work is done 


E. Jacobson, D. Trumbo, Gloria Cheek, and J. Nangle 


changes only a little or not at all. Seventy-two
percent want the work to change “somewhat”
or “quite a bit,” and 1 in 5 wants the work to
change a great deal (Table 3).

Those employees who believe that more
changes take place in the way they do their
jobs than is true for the average task are
more likely to say that they approve of this
state of affairs than are the employees who
say that their jobs tend to have fewer changes.

About 40% of the employees would like to
have a large part or all of their work involve
the use of office machines. One in eight would
prefer not to have his work involve the use
of machines.

About 40% of the employees do not feel
that their kind of job will require any [illegible]
use of machines by 1960. However, [illegible]
the employees do see that their jobs will re-
quire more use of machines and almost [illegible]
reports that their jobs will require less use of
machines.

Table 4


Has This Aspect of Your Job Changed in the Past Year?
“Yes, more now”


Did the changeover to the computer affect
this aspect of your job?


                                               Yes   No   NA   Total Percentage   N

The amount of variety in my work               26%   16    2   [illegible]        [illegible]
The degree to which my work is interesting     19%   23    2   44                 99
The amount of work required on this job        21%   21    1   43                 92
The amount of responsibility demanded          18%   20    2   40                 88
The amount of skill needed on this job         17%   19    2   38                 88
The degree of accuracy demanded by this job    19%   17    2   38                 78
The amount of security I feel on this job      12%   21    1   34                 75
My chance for promotion to a better job        13%   17    2   32                 69
The extent to which I can pace my own work     12%   16    1   29                 66
The amount of pay I get on this job             6%   22    1   29                 32
The amount of supervision I get on this job     6%    8    [illegible]


What aspects of the job has the computer
affected? In Table 1 we found that about
one third of the nonsupervisory employees of
the company studied believed that the com-
puter had created significant changes in their
jobs. To determine the kinds of changes the
employees believed were taking place, ques-
tions were asked about 11 aspects of the work-
ing situation. For each of these 11 aspects,
three questions were asked:


“Has this aspect of your job changed in the past 
year?” 

“How do you feel about this change (or lack of 
change) in your job?” 

“Did the changeover to the computer affect this 
aspect of your job?” 


Because the bulk of the employees, about 
two thirds, did not report any major change 
in their work, we will first examine the dif- 
ferences in “no change” answers among the 
11 aspects of the job. 

Employees are more likely to report that
there has been “no change” in the amount of
supervision they receive, the amount of pay
they receive, and their chances for promotion
than that there has been “no change” in the
amount of work they do.

When asked directly whether the change-
over to the computer had affected these as-
pects of their jobs, an even larger percentage
responded “no.” When we examine these
“no” responses we find roughly the same or-
dering. Employees are more likely to say
that the computer did not affect the level of
their pay than that it did not affect the
amount of work they do or the variety in
their work.

Turning to the employees who did report
change, in Table 4 we have an analysis of the
response “yes, there is more now” for each of
the 11 aspects of the job. First, we find that
employees are more likely to report an in-
crease in the variety of their work, the de-
gree to which it is interesting, and the amount
of work than they are to report an increase
in amount of supervision or pay.

But there is no simple relationship be-
tween perceived increase in these aspects and
imputed effect of the computer. The rela-
tively small number of persons who report a
pay increase in general do not attribute it to
the introduction of the computer. The rela-
tively large number who see that there is
more variety in their work do attribute it to
the computer. Most of those who report
more security do not feel that the computer is
responsible. In the other aspects, about
half of those who report change attribute it
to the computer.

Among the small number of persons who
report decreases in various aspects of their 
jobs, the computer is slightly more likely to 
be credited with decreases in amount of work, 
and slightly less likely to be credited with de- 
creases in amount of supervision. 


Summary 


Questionnaires about technological change 
and the installation of a new electronic com- 
puter were administered to all of the em- 
ployees of a medium sized insurance company. 

About one third of the nonsupervisory em- 
ployees reported that the introduction of the 
computer had affected their jobs. Most of 
the employees welcomed changes in their 
work, although they thought that changes 
were taking place somewhat too rapidly in 
the world in general. Most of the employees 
like to work with machines and expect more 
use of machines in the future. They believe 
that machines are replacing workers in office 
situations but do not feel that they them- 
selves will be replaced. They do not perceive 
that the introduction of the new technologies 
has had much effect on the amount of pay 
they get, their chances for promotion, or the 
amount of supervision they receive. But they 
do believe that the new technologies have 
changed the amount of work that they do and 
the degree to which there is variety in their 


work. 
BIBLIOGRAPHY 


Becker, Esther R., & Murphy, E. F. The office
in transition. New York: Harper, 1957.

Coch, L., & French, J. R. P., Jr. Overcoming re-
sistance to change. Hum. Relat., 1948, 1, 512-532.

Craig, H. F. Administering a conversion to elec-
tronic accounting; a case study of a large office.
Boston: Div. of Res., Grad. Sch. of Bus. Admin.,
Harvard Univer., 1955.

Jacobson, E. H. Some social psychological aspects
of employee response to technological change. In
Proceedings of the Fifteenth International Congress
of Psychology, Brussels 1957. Amsterdam: North
Holland, 1959.

Jacobson, E. H. The effect of changing industrial
methods and automation on personnel. In Sym-
posium on preventive and social psychiatry.
Washington, D. C.: Walter Reed Army Medical
Center Documents, 1959.

Killingsworth, C. Automation in manufacturing.
Paper read at Industr. Relat. Res. Ass. meeting,
Chicago, December, 1958.

Levin, H. S. Office work and automation. New
York: Wiley, 1956.

Mann, F. C. Studying and creating change: A
means to understanding social organization. In
Research in industrial human relations. New
York: Harper, 1957.

Stieber, J. Automation and the white collar worker.
Personnel, 6, 1958.

Stieber, J. The impact of automation on white-
collar workers. Paper read at the Midwest. Econ.
Ass. meeting, May, 1957.

Trumbo, D. A. An analysis of attitudes toward
change among the employees of an insurance com-
pany. Unpublished doctoral dissertation, Michi-
gan State Univer., 1958.


(Early publication received April 29, 1959) 




Journal of Applied Psychology 
Vol. 43, No. 6, 1959


A STATISTICAL EVALUATION OF EDWARDS 
PERSONAL PREFERENCE SCHEDULE 1


EDWARD LEVONIAN, ANDREW COMREY, WILLIAM LEVY, and DONALD PROCTER


University of California, Los Angeles 


The Edwards Personal Preference Schedule
(PPS) has been used widely in both the ap-
plied and research fields as a means of meas-
uring 15 “normal” personality variables (Ed-
wards, 1954). To obtain information about
the factor structure of the PPS, factor analy-
ses were carried out for the items within each
scale. The results of these 15 factor analyses
constitute the subject of this report.

The PPS was designed to measure the
strength of personal need in the following
scale variables: Achievement, Deference, Or-
der, Exhibition, Autonomy, Affiliation, Intra-
ception, Succorance, Dominance, Abasement,
Nurturance, Change, Endurance, Heterosexu-
ality, and Aggression. The test consists of
225 items, each of which contains two alter-
nate statements from which the subject is
supposed to choose the one more nearly char-
acterizing himself. The statements are sup-
posed to be equal in social desirability. Fif-
teen items occur twice to allow a check on
respondent consistency.

Except for the duplicate consistency items,
each item is scored on two of the 15 scale
variables. Thus, 29 items are scored for any
given scale, but 28 of these items are also
scored for another scale, two items for each
of the other 14 scales. Each of these items
is so phrased that one of the two available
responses is scored positively for the scale in
question, whereas the alternate response is
scored positively for the other scale variable
upon which the item is scored. This system
gives the respondent a “forced choice” in
which he can appear high (or low) in some
but not all variables.
1 The authors are indebted to J. R. Marshall,
Fred S. Taylor, and Hilde Groth, who partici-
pated in the early stages of the research but not
sufficiently to warrant coauthorship.

Apart from the 15 identical consistency
items, other items are related to one another
by virtue of being half identical. An item
statement may appear in several items in
opposition to a different alternative state-
ment in every case, except for the consistency
items which repeat both statements. Any
two items in this test, therefore, can be clas-
sified into one of five kinds of relationships
that can have an important effect upon fac-
torial structure. The five possible relation-
ships are:

Reciprocal consistency pair. When two 
items bear this relationship to each other, 
they are identical. Each scale has two and 
only two such items, e.g., Items 1 and 151, 
resulting in one and only one such pair. The 
pair is reciprocal because both items offer a 
choice between the same two scale variables. 

Reciprocal iterative pair. Items in this 
group form pairs which offer the same choice 
between scale variables to the respondent, 
e.g., Achievement vs. Order, but in addition
are half identical. That is, two such items, 
e.g., Items 3 and 11, will include a common 
statement between them, although the other 
statement will be different in the two items. 
Fifty-two such pairs appear in the test. 

Reciprocal diverse pair. These items offer 
a choice between the same two variables but 
do not share a common statement. All four 
alternative statements in the two items, e.g., 
Items 2 and 6, are different. Fifty-three such 
pairs exist in the test. 

Nonreciprocal iterative pair. These items
share a common statement but do not present 
a choice between the same two scale vari- 
ables. Due to the overlapping statement, one 
of the variables will be the same in both 
items, e.g., Items 6 and 27, but the opposed 
variable differs as well as the statements. 

Nonreciprocal diverse pair. A pair in this 
category shares no common statement, nor 
does it offer the same variable choice for the 
two items, e.g., Items 2 and 18. 
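The five relationships above can be sketched as a small classification routine. This is a minimal illustration only: the `Item` structure, the statement identifiers such as "s1", and the example items are hypothetical and are not taken from the PPS itself.

```python
from typing import NamedTuple


class Item(NamedTuple):
    """One forced-choice item: two statements, each keyed to a scale variable."""
    stmt_a: str   # hypothetical statement identifier
    scale_a: str
    stmt_b: str
    scale_b: str


def classify_pair(x: Item, y: Item) -> str:
    """Name the relationship between two items, following the five definitions."""
    same_choice = {x.scale_a, x.scale_b} == {y.scale_a, y.scale_b}
    shared = len({x.stmt_a, x.stmt_b} & {y.stmt_a, y.stmt_b})
    if same_choice and shared == 2:
        return "reciprocal consistency"
    if same_choice and shared == 1:
        return "reciprocal iterative"
    if same_choice:
        return "reciprocal diverse"
    if shared == 1:
        return "nonreciprocal iterative"
    return "nonreciprocal diverse"


# Hypothetical items (statement ids and scale pairings are made up).
first = Item("s1", "Achievement", "s2", "Order")
repeat = Item("s1", "Achievement", "s2", "Order")
half_same = Item("s1", "Achievement", "s3", "Order")
all_new = Item("s4", "Achievement", "s5", "Order")
other_scale = Item("s1", "Achievement", "s6", "Change")
unrelated = Item("s7", "Achievement", "s8", "Change")

for other in (repeat, half_same, all_new, other_scale, unrelated):
    print(classify_pair(first, other))
```

Counting the labels over all pairs of items within one scale would give a tally of each relationship type of the kind reported in the text.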

From a consideration of what the PPS is
supposed to measure and certain aspects of its
composition, the following results might be
expected:

1. Substantial interitem correlations within
scales should emerge. The 29 items for each
scale are supposed to measure that scale vari-
able, hence they should show substantial in-
tercorrelations, resulting in one factor com-
mon to the items.

2. Fourteen doublet factors should emerge.
Since every item on a scale is also scored on
some other scale and there are two items for
each other scale, a factor should appear for
each pair of items measuring the same scale
variable other than the one being analyzed.
This would give 14 doublet factors on each
scale.

3. A consistency factor should emerge.
Variance specific to the two identical items
should appear as an essentially specific fac-
tor on each scale.


Procedure 


To test these expectations, each of the 15 scales
was factor analyzed independently of the other
scales.2 Data consisted of the responses of the first
360 Ss chosen as every fourth case from the 1509
cases in the original normative sample.3 Phi coeffi-
cients were used in the correlation tables since phis
have been found to yield more satisfactory results
in factor analytic work than other commonly used
point coefficients (Comrey & Levonian, 1958). Fac-
tors were extracted by the complete centroid method
until at least three successive factors failed to yield
a loading as large as .25. These factors were rotated
analytically using Kaiser’s (1958) normal Varimax
method, an orthogonal method which tends to maxi-
mize the variance of the squared extended vector
projections by pairs of factors over all possible pairs.
Iteration is continued until an acceptable converg-
ence of the solution has been achieved. No hand
rotations were made.
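The phi coefficient named in the procedure can be illustrated with a short sketch. It assumes each response is coded 0 or 1 (for instance, 1 when the B alternative is chosen); the function names and the four made-up response rows are illustrative and are not the authors' SWAC program or data.

```python
import math


def phi(x, y):
    """Phi coefficient between two equally long lists of 0/1 responses."""
    n11 = sum(1 for a, b in zip(x, y) if a == 1 and b == 1)
    n10 = sum(1 for a, b in zip(x, y) if a == 1 and b == 0)
    n01 = sum(1 for a, b in zip(x, y) if a == 0 and b == 1)
    n00 = sum(1 for a, b in zip(x, y) if a == 0 and b == 0)
    denom = math.sqrt((n11 + n10) * (n01 + n00) * (n11 + n01) * (n10 + n00))
    if denom == 0:
        # An item answered identically by everyone has no variance; returning
        # 0.0 here is a convenience assumed for this sketch.
        return 0.0
    return (n11 * n00 - n10 * n01) / denom


def phi_matrix(responses):
    """Interitem phi table; rows of `responses` are subjects, columns are items."""
    items = list(zip(*responses))
    return [[phi(a, b) for b in items] for a in items]


# Four hypothetical subjects answering three items.
responses = [
    [1, 1, 0],
    [1, 1, 1],
    [0, 0, 1],
    [0, 0, 0],
]
table = phi_matrix(responses)
for row in table:
    print(row)
```

A table of this kind, computed over the 29 items of one scale, would be the input to the factor extraction step.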


Results 


It will be impossible to present even in
summary form the actual results of 15 factor
analyses in so little space. Only certain out-
standing features and general implications of
the results will be treated. Those interested
in more details may examine the complete
documents available through the ADI.4

2 All computations were performed on SWAC, an
electronic digital computer operated by Numerical
Analysis Research at the University of California,
Los Angeles, and supported by the Office of Naval
Research. The opinions expressed here are the au-
thors’ and do not necessarily reflect those of the
United States Navy.

3 Professor Edwards kindly made available the re-
sponse sheets of the 360 Ss.

First of all, the intercorrelations between
the items within scales are generally low.
As an example of the interitem correlations
encountered, the correlations for the first
scale (Achievement) are given in Table 1.
While most of the correlations are positive,
the average correlation is less than .08, not
significant.

The interitem correlation table for the
achievement scale is typical of the correlation
tables for the other 14 scales. Although the
Kaiser Varimax method of factor rotation
does not tend to favor the emergence of a
general factor, it is doubtful if any method of
factor analysis would give a very strong gen-
eral factor from tables of intercorrelations
such as these.

The second major finding showed that in-
stead of having 14 doublet factors on each
scale determined by reciprocal pairs of items
supposedly measuring the same scale vari-
ables, as was expected, factors were largely
determined by the repeated statement. There
are 105 reciprocal pairs of items in the PPS,
each occurring in two analyses. Of these 210
pairs, 104 are of the iterative type, sharing a
common statement, and 106 are of the diverse
type, not sharing a common statement. Whereas
73 iterative pairs of items appeared together
on the same factor, only 21 diverse pairs did
so. Of the 161 factors with two or more
loadings of .30 or more, only 30 contained no
common statement among their items. About
half the factors were loaded by reciprocal
pairs and three-fourths of these were of the
iterative type. Only about 10% of all fac-
tors were loaded by reciprocal diverse pairs
and these were outnumbered 2 to 1 by fac-
tors which were loaded by nonreciprocal itera-
tive pairs.
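The Varimax rotation referred to in these results proceeds by rotating factors in pairs until the solution converges. The sketch below is an illustrative reconstruction of that pairwise scheme with row ("normal") normalization, using the commonly cited closed-form rotation angle. It is not the authors' SWAC program, and the tolerance, sweep limit, and toy loadings are assumptions.

```python
import math


def varimax(loadings, tol=1e-8, max_sweeps=100):
    """Pairwise normal Varimax rotation of a loading matrix (list of rows)."""
    n = len(loadings)
    m = len(loadings[0])
    # "Normal" Varimax: scale each row to unit length before rotating.
    h = [math.sqrt(sum(v * v for v in row)) or 1.0 for row in loadings]
    b = [[v / h[i] for v in row] for i, row in enumerate(loadings)]
    for _ in range(max_sweeps):
        largest = 0.0
        for p in range(m - 1):
            for q in range(p + 1, m):
                u = [row[p] ** 2 - row[q] ** 2 for row in b]
                v = [2 * row[p] * row[q] for row in b]
                a_sum, b_sum = sum(u), sum(v)
                c_sum = sum(ui * ui - vi * vi for ui, vi in zip(u, v))
                d_sum = 2 * sum(ui * vi for ui, vi in zip(u, v))
                angle = 0.25 * math.atan2(
                    d_sum - 2 * a_sum * b_sum / n,
                    c_sum - (a_sum ** 2 - b_sum ** 2) / n,
                )
                largest = max(largest, abs(angle))
                c, s = math.cos(angle), math.sin(angle)
                for row in b:
                    x, y = row[p], row[q]
                    row[p] = x * c + y * s
                    row[q] = -x * s + y * c
        if largest < tol:  # no pair needed a noticeable rotation
            break
    # Restore the original row lengths (communalities are unchanged).
    return [[v * h[i] for v in row] for i, row in enumerate(b)]


# Toy check: a clean two-factor pattern, deliberately rotated by 30 degrees.
simple = [[0.8, 0.0], [0.7, 0.0], [0.0, 0.6], [0.0, 0.9]]
c30, s30 = math.cos(math.radians(30)), math.sin(math.radians(30))
mixed = [[x * c30 - y * s30, x * s30 + y * c30] for x, y in simple]
recovered = varimax(mixed)
```

In this toy case the rotation returns the mixed loadings to the clean two-factor pattern, so each item again loads on a single factor.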


4 The following tables have been deposited with
the American Documentation Institute: (1) in-
tercorrelations for each of the 15 scales, (2) rotated
factors for each of the 15 analyses, (3) intercorrela-
tion among reciprocal iterative and reciprocal diverse
pairs, and (4) factors loaded by reciprocal pairs.
Order Document No. 6078 from the ADI Auxiliary
Publications Project, Photoduplication Service, Li-
brary of Congress, Washington 25, D. C., remitting
in advance $2.25 for 35 mm. microfilm or [price
illegible] for 6 X 8 in. photocopies readable without
optical aid. Make checks payable to Chief, Photo-
duplication Service, Library of Congress.


Table 1

Interitem Phi Coefficients for the Achievement Scale

[The matrix of phi coefficients is not recoverable from the scan.]


Table 2

Comparison of Factor Loadings for Two
Methods of Scoring

        Factor Loadings
Item    Consistent     Scoring Key

11      [illegible]    .63
3       -.60           [illegible]
46      [illegible]    [illegible]
153     -.39           .44
56      .33            .35

[The columns showing the alternative scored as 1 under each method, marked by underlining A or B, did not survive the scan.]


The third expectation, namely that a con- 
sistency factor would appear, was verified in 
every scale. The consistency factor, however, 
instead of taking only the specific variance 
left over, was usually one of the major fac- 
tors in the analysis, 

These factor results, and the ADI tables,
are based on an item scoring method in which
the B alternative to each item was scored as
1, but essentially the same results would be
obtained if the items were scored according
to the key issued with the EPPS. For each
scale this key scores the A alternatives as 1
for half the items, the B alternatives as 1 for
the remaining half. To demonstrate the rela-
tive unimportance of the scoring method em-
ployed, the first scale (Achievement) was fac-
tor analyzed again using key scoring. Table 2
gives the results for the five items with the
highest loadings on a corresponding
factor for both types of scoring: (a) the con-
sistent method with the B alternative consist-
ently scored as 1, and (b) the scoring key
method. The alternative scored as 1 is un-
derlined. It is seen that the main effect of
reversed scoring is to change the sign of the
factor loading; however, the interpretation of
the factor remains the same, since this inter-
pretation takes into account the manner of
scoring.
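The sign change under reversed scoring follows from the correlations themselves: recoding one item from x to 1 - x reverses the sign of its correlation with every other item and leaves the magnitude unchanged. A small sketch with made-up responses (the data and function name are illustrative only) shows this for the product-moment correlation of two 0/1 items, which is equal to phi:

```python
import math


def correlation(x, y):
    """Product-moment correlation; for 0/1 data this equals the phi coefficient."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical responses of eight subjects to two items (1 = B alternative chosen).
item_x = [1, 1, 1, 0, 0, 0, 1, 0]
item_y = [1, 1, 0, 0, 0, 1, 1, 0]

# Key scoring that reverses item_x (the A alternative now scored as 1).
item_x_reversed = [1 - a for a in item_x]
item_y_reversed = [1 - b for b in item_y]

r_original = correlation(item_x, item_y)
r_one_reversed = correlation(item_x_reversed, item_y)
r_both_reversed = correlation(item_x_reversed, item_y_reversed)
print(r_original, r_one_reversed, r_both_reversed)
```

Reversing one item flips the sign of the correlation, while reversing both restores it, which is why loadings change sign but the factor pattern does not.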

In order to gain some understanding of the
basis for the anomalous factor results, the in-
teritem correlations were investigated accord-
ing to the five different types of item pairs.
The order of magnitude of phis from top to
bottom for the various types of pairs was:
consistency pairs, reciprocal iterative pairs,
nonreciprocal iterative pairs, reciprocal di-
verse pairs, and nonreciprocal diverse pairs.
The median values were, respectively, .50,
.30, .21, .17, and .00. The fact that the
median correlation was greater for iterative
items not supposed to be measuring the same
two variables than for the reciprocal diverse
items, which are supposed to be measuring
the same two variables, closely parallels the
factor results. The low values of the cor-
relations in general and particularly for the
identical consistency items should be noted.
The range of phi coefficients for the identical
consistency items was .30 to .68. No non-
identical item pair had a phi larger than [illegible].

Discussion 


The results of these analyses reveal an
unexpectedly large discrepancy between what
the PPS is designed to measure and the actual
item factorial content. Instead of finding
large factors which are readily identifiable
along the lines of the major variables scored
in the test, one finds a large number of nar-
row factors, the majority of which seems to
be based upon shared common statements.
Furthermore, the correlations are low between
items which are supposed to measure the
same variables. Even the same item repeated
later in the test results in a median correla-
tion of only .50.

In the opinion of the authors the failure of
the EPPS to give the expected factor results
stems from: (a) using the same item state-
ment in several different items, (b) scoring
the same item on two scales, and (c) using
the forced-choice item form with equated so-
cial desirability of the item statements. The
first practice introduces significant amounts
of overlapping specific variance and limits
the sampling of trait indicators. The second
practice also introduces overlapping variance
into the scale scores. No item should be
scored on more than one scale, unless perhaps
as a suppressor variable. It is difficult for
two scales to be independent of one another
if they share the same items.

The third and perhaps most serious diffi-
culty lies with the use of forced-choice items.
The basic form of the PPS item is one which
encourages low reliability of response. The S
must choose which of two statements seems 
more descriptive of himself, yet the choice is 
made more difficult by equating the state- 
ments for social desirability. Sometimes the 
choice is difficult because two statements 
seem about equally applicable. At other times 
the choice is difficult because the two state- 
ments seem about equally inapplicable. The 
test situation tends to maximize the number 
of difficult, and hence unreliable, choices for 
the S. Even for a conscientious respondent, 
it is difficult to be accurate and consistent 
under such circumstances. Less careful indi-
viduals easily develop a negative attitude to-
ward the test situation, which promotes care-
lessness, further reducing the reliability of
response.

The PPS has adopted this forced-choice
form for the purpose of avoiding respondent
tendency to present a good picture of himself.
Whereas this is a laudable objective, it does
not seem to have been attained without ex-
cessive cost, if at all. Item form should make
it as easy as possible for the respondent to ex- 
press himself and his position as exactly as 
possible, truthfully or not. Whether or not 
the individual is answering truthfully, or giv- 
ing himself the benefit of the doubt, should 
be determined by other methods and this in- 
formation used in evaluating the test results. 
Attempts to force truthfulness by special item 
forms seem likely to succeed principally in 
reducing item reliability and validity to the 
point where the test has questionable utility. 


REFERENCES 


Comrey, A. L., & Levonian, E. A comparison of
three point coefficients in factor analyses of MMPI
items. Educ. psychol. Measmt., 1958, 18, 739-755.

Edwards, A. L. Edwards Personal Preference Sched-
ule Manual. New York: Psychol. Corp., 1954.

Kaiser, H. F. The varimax criterion for analytic
rotation in factor analysis. Psychometrika, 1958,
23, 187-200.
(Received July 10, 1958) 


Journal of Applied Psychology
Vol. 43, No. 6, 1959


THE PREDICTIVE EFFICIENCY OF TEMPERAMENT 
CHARACTERISTICS AND PERSONAL HISTORY 
VARIABLES IN DETERMINING SUCCESS 
OF LIFE INSURANCE AGENTS 1


PETER F. MERENDA and WALTER V. CLARKE


Walter V. Clarke Associates, Inc. 


One of the most serious and persistent 
problems confronting the life insurance indus- 
try is that of recruiting and selecting agents 
who will become successful “career” salesmen.
Current attrition rates among new hires are 
close to 75-80% at the end of a three-year 
period. This great turnover of personnel 
represents considerable costs to the industry 
in the continual recruitment, training, and 
financing of new agents, and in the financial 
support, over a period of time, of agents who 
are not likely to sell sufficient premiums to 
cover the company’s investment in them. 

The present study is part of a long range 
investigation by one company conducted for 
the purpose of evaluating existing screening 
procedures in the selection of life insurance 


agents and in developing efficient predictors 
of likely failures.


Subjects 


Ss for this study were [illegible] agents who were
hired [illegible] September 1, 1950 and [illegible].
[Illegible] agents were all financed by the company
(i.e., hired [illegible]). This sample com[illegible].
For all, at least a three-year period had elapsed
since the time of hire at the date the study was
designed in early 1958. All 522 agents were
originally selected from a large applicant popu-
lation on the basis of results obtained on the Ac-
tivity Vector Analysis (AVA) and personal history
data reported on a confidential questionnaire. The
selections were made, however, on a rather informal
basis since no definite criterion cut-offs were set.
Only guide lines were established for the general
agents who, in the final analysis, exercised their pre-
rogatives as to whom to hire or not hire. Neverthe-
less, there is reason to believe that the personality
profiles were given more weight than the personal
history variables. The underlying reason for this as-
sumption is that an integrated personality profile de-
termined to be “best” for life insurance agents was
being used by the company during the period in
which the subjects were hired whereas no such pro-
file was determined for the personal history vari-
ables. A survey (Walter V. Clarke Associates, Inc.,
Staff, 1955) of the general agents attested to the
relative weights of these two sets of data given
by them in their consideration of new applicants.
As a result the subjects of this study are more highly
restricted in their range of personality profiles than
they are on the personal history measures.

1 Further acknowledgment is gratefully made for
the helpful suggestions and criticisms given by Alice
L. Palubinskas of the Psychology Department, Tufts
University, who was research consultant on this study.


Predictors 


The two sets of predictors of success as life insur-
ance salesmen employed in this study were the per-
sonality inventory and certain personal history vari-
ables obtained from a locally prepared questionnaire
as indicated above. The Activity Vector Analysis
(AVA) is a self concept personality assessment in-
strument. It is widely used in industry in the clas-
sification and selection of personnel at all levels of
employment. The details of the construction, [illegible]
and application of the AVA have been published by
Clarke (1956b). Reliability and validity data on
this inventory have been reported by Clarke ([illegible];
1956c), Hammer (1958), Lundin (1957), Merenda
(1959a; 1959b), Musiker (1958), and Whisler ([year
illegible]).

The personal history questionnaire covers the fol-
lowing 20 areas of vital statistics, training and ex-
perience: (1) age, (2) marital status, (3) number
of children, (4) military status, (5) educational level,
(6) percentage of educational expenses earned, (7)
number of organizations of which a member, (8)
number of offices held, (9) number of years of
previous work experience, (10) length of time at
present residence, (11) dollar amount of [illegible]
monthly income, (12) dollar amount of outstanding
debts, (13) dollar amount of life insurance purchased
for self, (14) dollar amount of minimum monthly
living expenses, (15) attendance at sales and/or non-
sales courses, (16) employment status of [illegible],
(17) type of recreation in which normally engaged,
(18) previous sales experience, (19) total number of
friends, (20) number of friends in professional and
executive/managerial class. These data were re-
corded on the date of application for employment of
each of the agents of this study. The AVA was also
administered to each at that time.


Criteria 


For the purpose of evaluating the prediction va- 
lidities of the AVA and the personal history vari- 
ables as selectors of life insurance salesmen, the fol- 
lowing criterion standards were set for each subject 
at the expiration of his third year after first employ- 
ment. 


Success 


A successful agent is one who: 

1. Meets his Training Allowance Program quotas
or achieves $200,000 production in his first year and
at least $300,000 in either his second or third year;
or

2. Is advanced to a supervisory or management
position within the company; or

3. Leaves the company to become an agent, su-
pervisor, or general agent of another company be-
fore the end of the third year if he achieves the pro-
duction goals outlined above.


Failure 


An unsuccessful agent is one who:

1. Fails to reach the production goals outlined,
whether or not he remains as an agent with the com-
pany; or

2. Has had his contract terminated by the com-
pany; or

3. Leaves the insurance industry within a three-
year period.

The sample of 522 agents was dichotomized on
these three-year criterion standards. A total of 414
agents were classified as “unsuccessful” and only 108
as “successful.” It will be noted that a very high
failure rate (4 out of 5) exists for the agents of this
study.

A further criterion in the form of new business
volume (face value of insurance policies) at the end
of the first, second, and third years was available
for these agents.


Procedure 


The AVA and the personal history variables
were studied separately as to their relative
predictive efficiencies in determining the suc-
cess or failure of the life insurance salesmen
of this study. The results obtained by em-
ploying the AVA alone have been published
as a separate report (Merenda, 1959a). For
both sets of predictors discriminant analysis
was applied to the problem of providing
maximum separation between the successful
and unsuccessful agents of the sample of 522
life insurance salesmen. The resulting dis-
criminant functions were tested for statistical
significance and the empirically determined
weights for each battery in this initial valida-
tion study were used to build relative fre-
quency distributions of discriminant scores
for the purpose of establishing definite “go”
versus “no go” criterion cut-offs.
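
The report does not give its computing procedure. As an illustration only, a two-group linear discriminant of the kind described can be sketched with Fisher's classical solution, in which the weights are the pooled within-group covariance matrix inverted and applied to the difference in group means. The function names and the small linear solver below are assumptions made for this sketch, not the authors' method.

```python
def mean_vector(rows):
    """Column means of a list of equal-length score tuples."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def scatter(rows, mean):
    """Sum of squares and cross-products about the group mean."""
    p = len(mean)
    s = [[0.0] * p for _ in range(p)]
    for row in rows:
        d = [x - m for x, m in zip(row, mean)]
        for i in range(p):
            for j in range(p):
                s[i][j] += d[i] * d[j]
    return s


def solve(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for c in range(n):
        pivot = max(range(c, n), key=lambda r: abs(m[r][c]))
        m[c], m[pivot] = m[pivot], m[c]
        for r in range(n):
            if r != c:
                f = m[r][c] / m[c][c]
                m[r] = [x - f * y for x, y in zip(m[r], m[c])]
    return [m[i][n] / m[i][i] for i in range(n)]


def discriminant_weights(successful, unsuccessful):
    """Fisher weights: pooled covariance inverse times the mean difference."""
    m1, m0 = mean_vector(successful), mean_vector(unsuccessful)
    s1, s0 = scatter(successful, m1), scatter(unsuccessful, m0)
    dof = len(successful) + len(unsuccessful) - 2
    pooled = [[(x + y) / dof for x, y in zip(r1, r0)] for r1, r0 in zip(s1, s0)]
    diff = [x - y for x, y in zip(m1, m0)]
    return solve(pooled, diff)


def discriminant_score(weights, row):
    """Linear composite of one subject's scores."""
    return sum(w * x for w, x in zip(weights, row))
```

With weights obtained this way, the mean composite score of the successful group always exceeds that of the unsuccessful group, provided the pooled covariance matrix is nonsingular.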

The four-factor AVA resultant profile was 
used as the temperament measure set. For 
the personal history set only 5 of the 20 vari- 
ables were used in the battery. These five 
were the only ones which individually differ- 
entiated the two groups. They are: (1) num- 
ber of children, (2) educational level, (3) 
number of offices held, (4) dollar amount of 
minimum monthly living expenses, and (5) 
dollar amount of life insurance purchased for 
self. For the other fifteen personal history 
measures, the frequency distributions for suc- 
cessful and unsuccessful agents were either 
nearly completely overlapping or else the dif- 
ferences in means and variances were not sta- 
tistically significant. 

Pearson-type correlation coefficients were 
computed between the four individual AVA 
resultant variates expressed as ordinary stand-
ard scores with mean = 50 and σ = 10 and
the five personal history measures converted 
to normalized T scores. Comparisons were 
also made of the relative efficiencies of the 
individual variates in predicting the di- 
chotomy. Finally, an analysis was made of 
the power of the two sets, used as independ- 
ent screens and employing variable cut-offs, 
in rejecting likely failures as life insurance 
salesmen at the end of three years after first
hire.


Results and Discussion


The results of discriminant analysis of the 
AVA profile data are summarized in Tables 1 
and 2. Resultant AVA profile shape was 
used as the predictor variable. Standard 
scores for the four principal AVA vectors 
were transformed to deviation scores from 
the composite mean in order to remove the 
effect of Activity level (total number of 
words checked). Hence, the profiles consti- 
tuting the set of discriminant variates in this 
study were expressed as sets of deviations 
about the individual S’s mean. 
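
The transformation to profile shape described above can be illustrated briefly. The sketch assumes that each S has four vector standard scores and that the "composite mean" is the mean of those four scores; the function name is hypothetical.

```python
def profile_shape(vector_scores):
    """Express each vector score as a deviation from the S's own mean.

    Subtracting the individual's mean removes overall level (Activity),
    leaving only the shape of the profile; the deviations sum to zero.
    """
    mean = sum(vector_scores) / len(vector_scores)
    return [score - mean for score in vector_scores]


# Illustrative standard scores for Vectors 1 to 4 (not data from the study).
print(profile_shape([60, 55, 45, 40]))
```

Two Ss who checked very different numbers of words but with the same relative pattern would receive the same deviation profile under this scheme.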

The data of Table 1 show that the successful agents possessed
significantly higher scores on AVA Vectors 1 and 2 (aggressiveness
and sociability) and significantly lower scores than the unsuccessful
agents on AVA Vectors 3 and 4 (emotional control and social
adaptability). This pattern differentiation is consistent with the
hypothesized “best” profile for life insurance agents. Table 2
gives the appropriate discriminant weights for maximizing the
separation between the two groups. The analysis of maximum
separation yields an F value which is statistically significant
beyond the .05 level.

362 Peter F. Merenda and Walter V. Clarke

Table 1

Comparison of AVA Resultant Profiles (4 Vectors) for Three Year Successful and Unsuccessful
Life Insurance Agents

          Successful Agents       Unsuccessful Agents           Both
              (N = 108)               (N = 414)               (N = 522)
Variate    Mean        SD          Mean        SD          Mean         SD           r         t       p
V-1       6.88785    8.29838      4.59277    8.65429    [illegible]  [illegible]  [illegible] [illegible]  <.01
V-2      10.81308    7.43325      8.9457     8.11598    [illegible]  [illegible]  [illegible] [illegible]  [illegible]
V-3      −8.81308  [illegible]   −6.80722    6.74772     −7.22222     6.76171      −.150     3.46    [illegible]
V-4      −9.48598    5.96032     −7.27228    6.24560     −7.72989     6.25280      −.179     4.15    [illegible]
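
The legible correlation and t entries of Table 1 can be cross-checked against each other. The sketch below assumes that each t was obtained from the corresponding correlation by the usual conversion with N − 2 degrees of freedom; the source does not state this, but the legible entries agree with it.

```python
import math


def t_from_r(r, n):
    """t statistic for a correlation r based on n cases (n - 2 df)."""
    return r * math.sqrt(n - 2) / math.sqrt(1.0 - r * r)


# Vectors 3 and 4 of Table 1, N = 522.
for r in (-0.150, -0.179):
    print(r, round(abs(t_from_r(r, 522)), 2))
```

For r = −.150 and r = −.179 the computed magnitudes fall within rounding of the tabled values 3.46 and 4.15.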

The results of discriminant analysis of the set of personal
history variates are summarized in Tables 3 and 4. The data of
Table 3 show that with the exception of “number of offices held,”
in which the difference is not statistically significant, the
successful group possessed significantly higher scores on these
measures than did the unsuccessful group. The nonsignificant
variable was retained because it showed relatively high
differentiation at the end of one year and it was felt that if
included as a predictor in the three-year analysis it would
contribute, if only slightly, to the prediction when combined with
the other variates in the battery.

Table 2

Discriminant Analysis Data for AVA Resultant Profiles of Successful (N = 108) and
Unsuccessful (N = 414) Life Insurance Agents

            Difference in Means         Discriminant
Variate   (Successful − Unsuccessful)     Weights          F          p
V-1             +2.29508                 +.000018
V-2             +1.86851                 +.000045      [illegible]   <.05
V-3             −2.00586                 −.000058
V-4             −2.21370                 −.000020

Table 4 gives the appropriate discriminant weights for maximizing
the separation between the two groups. The analysis of maximum
separation yields an F value which is statistically significant
beyond the .001 level. Hence, from the data of the preceding tables
it would appear that both sets do possess individually significant
validity in predicting the success or failure of life insurance
agents at the end of three years. The next problem was to determine
whether the predictive efficiency would be enhanced if these sets
were combined into one battery or whether their use as two
independent predictors would yield more efficient results. The
answer to this question was forthcoming from a correlation matrix *
which revealed the relative independence of the individual variates
of the two batteries. The correlational data show that the personal
history measures are not only independent of the personality
variates but are also uncorrelated with each other. On the basis of
these findings it was decided to employ these two predictor sets as
independent batteries each with its own minimum
cut-off score.

* The correlation matrix of resultant AVA and standardized personal
history variables for this sample has been deposited with the
American Documentation Institute. Order Document No. [illegible]
from ADI Auxiliary Publications Project, Photoduplication Service,
Library of Congress, Washington [remainder illegible], remitting in
advance $1.25 for photocopies or [amount illegible] for 35-mm.
microfilm. Make checks payable to Chief, Photoduplication Service,
Library of Congress.

Predictive Variables in Success of Life Insurance Agents 363

Table 3

Distribution and Serial Correlation Statistics for Standardized Personal History Predictor Variables of
Three Year Successful and Unsuccessful Life Insurance Agents

                              Successful Agents     Unsuccessful Agents          Both
                                  (N = 108)             (N = 414)             (N = 522)
Variate                        Mean        SD        Mean        SD        Mean        SD        r       t      p
1. Number of Children        51.93519    8.32694   49.75604    8.79810   50.20690    8.71444   +.126   2.91   <.01
2. Educational Level         51.47222    8.521727  49.55556    9.51455   49.95211    9.35011   +.104   2.38   <.01
3. No. of Offices Held       51.48148    8.37820   50.53865    8.19579   50.73372    8.24270   +.058   1.33   >.05*
4. Monthly Living Expenses   52.79630    9.37366   49.43237    8.99429   50.12835    9.17586   +.185   4.30   <.001
5. Amount of Insurance       52.4444    10.54827   49.20531    8.95557   49.87548    9.39949   +.174   4.03   <.001

* Not significant.

Discriminant scores were calculated for each of the 522 Ss
employing the weights reported in Tables 2 and 4. Then, since
differentiation is independent of the unit used, the weighted
composite scores were multiplied by the constant 10,000 in order to
remove unnecessary decimal places. The distributions of these linear
composite scores representing the AVA profiles yielded a mean
of 13.08 and SD of 8.15 for the successful
group, and a mean of 10.16 and SD of 8.02
for the unsuccessful group. For the total
sample, the AVA discriminant scores ranged
from −12 to +36. The linear discriminant
score distributions for the personal history
variates yielded a mean of 101.20 and SD of
11.20 for the successful group, and a mean of
95.63 and SD of 9.24 for the unsuccessful
group. For the total sample, the personal history
discriminant scores ranged from 72 to 124.
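
The scoring step can be checked roughly against the reported figures. The sketch below applies the AVA discriminant weights of Table 2 to the successful-group means of Table 1 and scales by 10,000; because the score is linear, the score of the mean profile equals the mean score. The function and constant names are illustrative, and the printed weights are rounded, so only approximate agreement should be expected.

```python
AVA_WEIGHTS = [0.000018, 0.000045, -0.000058, -0.000020]  # Table 2, V-1 to V-4
SCALE = 10_000                                            # constant used in the text


def ava_discriminant_score(profile, weights=AVA_WEIGHTS, scale=SCALE):
    """Weighted linear composite of the four deviation scores, rescaled."""
    return scale * sum(w * x for w, x in zip(weights, profile))


# Successful-group means for V-1 to V-4 (Table 1).
successful_means = [6.88785, 10.81308, -8.81308, -9.48598]
print(round(ava_discriminant_score(successful_means), 2))
```

The result is about 13.1, close to the mean of 13.08 reported for the successful group.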
The discriminant score distributions were grouped in regular
intervals of 10, for each of the predictor sets and the probability
of acceptance was calculated for each interval. These data are
presented in Table 5. It will be noted that as the AVA scores begin
to rise above 10 and the personal history discriminant scores become
greater than 100 there is a continued increase over chance (P = .20)
in the probability of acceptance. On the other hand, as the AVA
scores become negative and the personal history scores go below
90, the probabilities gradually diminish to significantly less than
chance. These data suggest that the major discrimination by the
AVA occurs in the negative discriminant score range. This result is
as expected since Ss with extremely low scores are those possessing
profiles which are incompatible with the AVA pattern hypothesized
to be “best” for life insurance salesmen.
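
The grouping procedure can be sketched as follows, on the assumption that the "probability of acceptance" for an interval is simply the proportion of agents in that interval who met the three-year success criterion. The function name and the example scores are hypothetical.

```python
from collections import defaultdict


def acceptance_probabilities(scores, outcomes, width=10):
    """Proportion of successful agents within each score interval.

    Intervals are keyed by their lower bound; floor division places a
    score of -3 in the interval running from -10 to -1.
    """
    counts = defaultdict(lambda: [0, 0])  # lower bound -> [successes, total]
    for score, succeeded in zip(scores, outcomes):
        lower = (score // width) * width
        counts[lower][0] += int(succeeded)
        counts[lower][1] += 1
    return {lower: s / n for lower, (s, n) in sorted(counts.items())}


# Six made-up agents, for illustration only.
scores = [-3, 4, 12, 15, 18, 33]
outcomes = [False, False, True, False, False, True]
print(acceptance_probabilities(scores, outcomes))
```

Comparing each interval's proportion with the base rate of .20 (108 of 522) shows where a screen does better or worse than chance.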

Table 4

Five-Variate Discriminant Analysis Data for Standardized Personal History Variables of Three Year
Successful (N = 108) and Unsuccessful (N = 414) Life Insurance Agents

                                  Difference in Means         Discriminant
Variate                         (Successful − Unsuccessful)     Weights         F       p
1. Number of Children                  +2.17915               [illegible]
2. Educational Level                   +1.91666               [illegible]
3. No. of Offices Held                 +0.94283               +.000014        5.54    <.001
4. Monthly Living Expenses            [illegible]             [illegible]
5. Amount of Insurance Owned           +3.23913               [illegible]

Table 5

Probabilities (of Acceptance) Based upon Linear Discriminant Function Analysis of AVA and
Personal History Data for Validation Sample of Life Insurance Agents (N = 522)

  Discriminant Score Interval        Probability of Acceptance a
  AVA            Personal History    AVA            Personal History
  30-39            120-129           .40              .50
  20-29            110-119           .30              .34
  10-19            100-109           .23              .28
  0-9               90-99            .18            [illegible]
  −10 to −1         80-89          [illegible]      [illegible]
  −20 to −11        70-79            .00              .00

a The level of chance is at .20 since the successful-unsuccessful
split was in the ratio of 1 to 4.

In determining the maximum predictive efficiency of the AVA and
personal history measures, various cut-offs were tried. The
resulting data were analyzed both from the standpoint of percentage
of successful and unsuccessful agents who would have been rejected
by these standards and gross new business made by these agents over
the first three-year period after hire. These data are presented in
Tables 6 and 7. The data of Table 6 reveal that a cut-off score of
zero for the AVA is a highly efficient standard since it would
have rejected 39 unsuccessful agents at the sacrifice of only 2
successful agents. When combined with personal history cut-off
scores of 88, 91, and 92 the set remains a highly efficient
predictor system since both reject a relatively small proportion of
agents who were ultimately successful, and the incidence of overlap
among the relatively high proportion of unsuccessful agents rejected
is quite low. Inspection of the data of Table 7 discloses that the
121 agents rejected by the AVA cut-off of zero and personal history
cut-off of 88 produced over a three-year period a total of only
$31,626,000 in gross new business or an average of only $87,100 per
agent per year. This amount of production is considerably below the
$200,000 first year and $300,000 post-first-year standard
established by the agency department of the company conducting this
study. Due to the complex nature of the problem, it is not possible
to determine precisely or even estimate closely the net loss to the
company occasioned by the employment of these 121 agents. However,
when one stops to consider the initial training costs and the
salaries paid to these financed agents for the periods of their
employment with the company it becomes apparent that the revenue
brought in by these salesmen was considerably less than the company
expense of placing and keeping them on the payroll.
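
The two-screen rule and the production average quoted above can be illustrated as follows. The sketch assumes that an applicant is rejected when either discriminant score falls below its cut-off (the text does not say whether a score equal to the cut-off passes), and that the $87,100 figure is a per-agent, per-year average; both function names are hypothetical.

```python
def rejected_by_screens(ava_score, ph_score, ava_cutoff=0, ph_cutoff=88):
    """Independent screens: failing either one rejects the applicant."""
    return ava_score < ava_cutoff or ph_score < ph_cutoff


def average_yearly_production(total_production, n_agents, years=3):
    """Mean new business per agent per year, in dollars."""
    return total_production / (n_agents * years)


# 121 agents rejected at AVA = 0, PH = 88 produced $31,626,000 in three years.
print(round(average_yearly_production(31_626_000, 121)))
```

The average comes to roughly $87,100, in agreement with the text and well under the $200,000 first-year standard.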

For cut-offs of A_AVA = 0 and A_PH = 91, there is a substantial
rise in the number of unsuccessful agents who would have been
rejected (45 or 11%). This increment is accompanied by a somewhat
smaller rate of change (9 or 8%) for the successful group. From
the standpoint of new sales made over a three-year period by these
agents who would not have been hired had these standards been
enforced at the time of hire, the 175 agents showed gross new
business of $47,914,000. This compares with a total of $196,341,70
[final digit illegible] worth of life insurance sold by the full
sample of 522 agents over this same period. Hence, for the first two
sets of cut-offs 16% and 24%, respectively, of the total production
were attributable to 23% and 33% of the agents of the study. Table 7
shows, however, that as the cut-off scores are increased beyond this
point, loss in total revenue (43% for A_AVA = 6, A_PH = 92) would be
substantial and presumably too great to be recouped entirely through
more efficient standards of selecting salesmen. Accordingly, the set
A_AVA = 0, A_PH = 91 was determined the most practical and efficient.
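
The percentages in this comparison follow from the counts and production totals reported in the text and in Table 7. The sketch below recomputes them; the full-sample total is taken as $196,341,700, which assumes that the digit lost from the source is a zero (the percentages are not sensitive to this). The names are illustrative.

```python
N_AGENTS = 522
TOTAL_PRODUCTION = 196_341_700  # assumed: final digit is truncated in the source

# cut-off set -> (agents rejected, their three-year production in dollars)
CUTOFF_SETS = {
    "AVA 0, PH 88": (121, 31_626_000),
    "AVA 0, PH 91": (175, 47_914_000),
    "AVA 6, PH 92": (262, 85_416_000),
}


def shares(n_rejected, production):
    """Fraction of agents rejected and fraction of total production forgone."""
    return n_rejected / N_AGENTS, production / TOTAL_PRODUCTION


for label, (n, production) in CUTOFF_SETS.items():
    agent_share, production_share = shares(n, production)
    print(f"{label}: {agent_share:.1%} of agents, {production_share:.1%} of production")
```

The first two sets reject about 23% and 34% of the agents while giving up about 16% and 24% of production; the strictest set gives up about 43%.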


Table 6

Predictive Efficiency of Combined AVA and Personal History Variables in the Selection of
Life Insurance Agents

                       Number of Agents Failing to Meet Standards     Percentage Rejected by These Standards
Discriminant Score     AVA Alone       PH Alone         AVA + PH        3 Yrs.         3 Yrs.
Cut-offs                S     U         S      U         S     U        Successful     Unsuccessful
AVA (0)  PH (88)        2    39         8  [illegible]  10   111           9%             27%
AVA (0)  PH (91)        2    39        17    134        19   156       [illegible]    [illegible]
AVA (0)  PH (92)        2    39        20    143        22   168          20%         [illegible]
AVA (2)  PH (92)       11    61        20    143        30   182          28%             44%
AVA (6)  PH (92)       21   118        20    143        39   223          36%             54%

Table 7

Paid Production of Life Insurance Agents Failing to Meet Varying Selection Criterion Standards

Discriminant                                        Paid Production*
Score Cut-offs       Category        N     1st Yr.      2nd Yr.      3rd Yr.      Total
AVA (0) PH (88)      Successful      10      2,740        4,531        6,028      13,299
                     Unsuccessful   111      9,310        5,479        3,538      18,327
                     Total          121     12,050       10,010        9,566      31,626
AVA (0) PH (91)      Successful      19      4,775        8,533        8,715      22,023
                     Unsuccessful   156     13,083    [illegible]      4,935    [illegible]
                     Total          175     17,858       16,406       13,650    [illegible]
AVA (0) PH (92)      Successful      22      5,695       10,310       11,203    [illegible]
                     Unsuccessful   168     14,246    [illegible]      5,383    [illegible]
                     Total          190     19,941    [illegible]     16,586    [illegible]
AVA (2) PH (92)      Successful      30   [illegible]    14,409       15,237    [illegible]
                     Unsuccessful   182   [illegible]     9,616        5,860    [illegible]
                     Total          212   [illegible]    24,025       21,097    [illegible]
AVA (6) PH (92)      Successful      39     12,076       17,218       18,273      47,567
                     Unsuccessful   223     18,948       11,918        6,983      37,849
                     Total          262     31,024       29,136       25,256      85,416


* In thousands of dollars.


Among the 15 remaining personal history variables which proved
not to be individually discriminative in terms of differences in
over-all frequency distributions, the factor of age appeared to
possess some significant discriminatory power at both extreme
levels. The data showed that persons who are near or over age 45
and near or below age 25 at the time of hire are not likely to
succeed as life insurance salesmen at the end of three years.

By setting the age limits, 25 > Yrs. > 45, an analysis was made of
the number of additional agents who would have been rejected by
these criterion standards and who would have successfully passed
the two other screens. It was found that with the criterion scores
of A_AVA = 0, A_PH = 91, which were judged to be the most
efficient, an additional 32 unsuccessful agents would have been
rejected at the sacrifice of only 3 successful agents. The new
business volume of these 35 individuals over a three-year period
was $10,554,000 or an annual average of $100,419 per agent. This
figure is far below the accepted standard of required performance.
Hence, there appeared to be ample evidence to conclude that
applicants for life insurance agent who are less than 25 years or
more than 45 years old are not likely to succeed in selling life
insurance over a sustained period of time. When the age criterion
variable with these two critical cut-off scores was combined with
the multiple cut-off set of A_AVA = 0, A_PH = 91, it was found that
an increase of 3% of the proportion of rejection of successful
agents resulted but that the rejection of unsuccessful agents was
increased by 7%. When these figures are analyzed in terms of total
numbers in each group (22 for successful, 188 for unsuccessful) the
gain in predicted efficiency for the total set is highly significant.
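
The effect of adding the age screen can be followed with the counts given in the text. The sketch below assumes the rule is "reject applicants younger than 25 or older than 45", which is how the conclusion of the passage is worded; the variable and function names are illustrative.

```python
N_SUCCESSFUL, N_UNSUCCESSFUL = 108, 414

# Rejected by the cut-off set A_AVA = 0, A_PH = 91 (19 and 156 in Table 7).
rejected_by_scores = {"successful": 19, "unsuccessful": 156}
# Additionally rejected by the age limits, per the text.
added_by_age = {"successful": 3, "unsuccessful": 32}


def outside_age_limits(age, low=25, high=45):
    """Age screen as stated in the conclusion: under 25 or over 45."""
    return age < low or age > high


combined = {g: rejected_by_scores[g] + added_by_age[g] for g in rejected_by_scores}
increase = {
    "successful": added_by_age["successful"] / N_SUCCESSFUL,
    "unsuccessful": added_by_age["unsuccessful"] / N_UNSUCCESSFUL,
}
print(combined)
print({g: f"{v:.1%}" for g, v in increase.items()})
```

The combined totals reproduce the 22 successful and 188 unsuccessful agents cited above, and the increments come to roughly 3% and 7 to 8% of the respective groups.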


Summary and Conclusions 


Temperament characteristics as measured 
by the AVA and also various personal history 
measures were investigated as to their predic- 
tive efficiency in the selection of life insur- 
ance salesmen. A total of 522 financed male 
agents employed full time in selling life insur- 
ance were studied three years after hire. 

The findings of this study disclose that ap- 
plicants for life insurance agent are not likely 
to be successful in selling life insurance over 
a sustained period of time if, temperament- 
wise, their self-perceptions are as passive and
submissive individuals rather than aggressive
and socially confident persons. 

The findings also reveal that personal-so- 
cial data are only of limited value when con- 
sidering these variables as predictors of suc- 
cess of life insurance agents. However, 5 of 
20 of these measures proved to be good dis- 
criminators of successful-unsuccessful agents 
when combined in a battery. One of the per- 
sonal history variables, age, showed that it 
could be used, individually, as a predictor of 
failure in life insurance selling for those ap- 
plicants whose ages are below and above cer- 
tain minimum and maximum levels. 

The data of the study also show that tem- 
perament characteristics, as measured by the 
AVA, and the discriminating personal history 
variates are uncorrelated, thereby making it 
possible to establish independent screens for 
selecting life insurance salesmen. They fur- 
ther point to the predictive efficiency of these 
personality and personal measures in deter- 
mining success or failure among the agents of 
the study, and suggest criteria to be evalu- 
ated when considering the employment of ap- 
plicants to this position. 

The following conclusions are held to be 
tenable from the data of this study: 


(a) The AVA is a valid predictor of suc- 
cess-failure among life insurance agents. 

(b) Certain personal history measures are
valid predictors of success-failure
among life insurance agents.

(c) Combining AVA and personal history
data enhances the predictive efficiency
of these measures in determining the
success or failure of life insurance
agents over a sustained period of time.


REFERENCES 


Clarke, W. V., Associates, Inc. A study of the
attitudes of general agents in the field of life in-
surance to Activity Vector Analysis. Unpublished
manuscript. Walter V. Clarke Associates, Inc.,
E. Providence, R. I., 1955.

Clarke, W. V. Personality profiles of self-made
company presidents. J. Psychol., 1956, 41, 413-
418. (a)

Clarke, W. V. The construction of an industrial
selection personality test. J. Psychol., 1956, 41,
379-394. (b)

Clarke, W. V. The personality profiles of life in-
surance agents. J. Psychol., 1956, 42, 295-302.
(c)

Hanter, C. H. A validation study of the Activity
Vector Analysis. Unpublished doctoral disserta-
tion, Purdue Univer., 1958.

Luxpis, W. H. A clinical evaluation of the Activity
Vector Analysis test. Unpublished manuscript.
T. W. Franks and Associates, Chicago, 1957.

Merenda, P. F., & Clarke, W. V. AVA as a pre-
dictor of occupational hierarchy. J. appl. Psy-
chol., 1958, 42, 213-218. (a)

Merenda, P. F., & Clarke, W. V. AVA validity
for life insurance salesmen. Engng. industr. Psy-
chol., 1959, 1, 1-11. (a)

Merenda, P. F., & Clarke, W. V. AVA validity
for textile workers. J. appl. Psychol., 1959, 43,
162-165. (b)

Musiker, H. R., & Clarke, W. V. The [illegible]
validity of Activity Vector Analysis. Psychol.
Rep., 1958, 4, 435-438. (a)

Whisler, L. D. A study of the descriptive validity
of Activity Vector Analysis. J. Psychol., 1957,
43, 205-223.


(Received October 17, 1958)


Journal of Applied Psychology 
Vol. 43, No. 6, 1959 


THE RELATIONSHIP OF MESSAGE UNITY (“PULL”)
TO THE RECIPIENT’S RESPONSE POTENTIAL 1


CHARLES MARGOLIS


The Bruthers Company, Cleveland, Ohio


AND CHARLES R. PORTER 2


Howard University 


Much current research in human communication relates to group
process, information theory, content analysis, and the
psychophysical study of messages. Little has been published on the
relations between defined properties of persuasive messages (the
words) and the attitude change they influence.

Hovland, Janis, and Kelley (1953) have reported that emotionality
of the message itself is a significant determinant of attitude
change in message recipients, in the case of an unpleasant emotion,
fear. Lasswell, in Smith, Lasswell, and Casey (1946) has discussed
theme emphasis and position effects in messages. Doob (1948) has
applied learning theory concepts to propaganda and public opinion.

This study tests one theoretical deduction from a general
quantitative theory of the relationships between message properties
and properties of the message recipient's behavior. The hypothesis
is that response potential is a function of message unity.


Definitions 


Message unity, approximately, may be thought of as the total
persuasive effect or “pull” of the message. Precisely
characterized, message unity is the sum of the message theme
feeling-tones, each weighted by the theme’s emphasis in the message
and the reciprocal of the theme’s similarity to the most emphasized
theme. The definition of message unity is based on a rational
judgment of the authors, grounded partly in the work of the above
quoted authors.

1 A more complete account may be found in the M.A. thesis of the
first named author, on file at Western Reserve University,
Cleveland, Ohio.

2 Charles R. Porter served as faculty advisor at Western Reserve
University, Cleveland, Ohio, for this study. The first named author
wishes to acknowledge his invaluable guidance and assistance.
Without it this study could not have been completed in its present
form.

Response potential is the tendency of the 
message recipient to perform the action to 
which the message would persuade. Response 
potential is the score on a questionnaire whose 
items are weighted in terms of the correlation 
between the mean response potential for a 
given message and the number of individuals 
induced to act as the message requests. 

A message as here defined is any product 
of language behavior. The message can be 
divided into rhetorical elements called themes. 
Each theme is a concept embodied in one or 
more sentences of the message. These three 
classes of theme properties may be identified: 

1. Theme feeling-tone is the rated intensity 
of the motive evoked by the theme. 

2. Theme emphasis is the rated degree of 
attention demanded by the theme. 

3. Theme similarity is the rated extent to 
which a given theme supports the most em- 
phasized theme in the message. Message 
unity is thus defined in terms of the three 
sets of theme properties. 
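
The verbal definition of message unity can be written out as a small computation. The exact functional form is not given in the text; the sketch below assumes that "weighted by" means multiplication, so that each theme contributes feeling-tone × emphasis × (1 / similarity). The function name and the example ratings are hypothetical.

```python
def message_unity(themes):
    """Sum over themes of feeling-tone * emphasis / similarity.

    `themes` is an iterable of (feeling_tone, emphasis, similarity)
    ratings, each assumed to lie on the 1-10 scale used by the raters,
    so similarity is never zero.
    """
    return sum(feeling_tone * emphasis / similarity
               for feeling_tone, emphasis, similarity in themes)


# Two made-up themes, for illustration only.
print(message_unity([(8, 5, 10), (6, 3, 2)]))  # 8*5/10 + 6*3/2 = 13.0
```

Under this reading a message gains unity both by adding themes and by raising the feeling-tone or emphasis of the themes it already contains.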


Method 


To test the research hypothesis, that response potential is a
function of message unity, three messages with widely disparate
levels of message unity were disseminated by direct mail
advertising. The messages, which requested the reader to purchase
vitamins, were written on the basis of the generalized motive
intensities of a sample of 107 vitamin purchasers, who were asked
to rank a series of motives for self-importance. Out of the three
messages there were abstracted critical word groups, each embodying
at least one theme. Thirteen themes were abstracted. Four themes
appeared in Message A, 10 in Message B, and 13 in Message C.

To determine message unity, four raters first evaluated theme
emphasis, theme similarity, and theme feeling-tone. Message themes
were exemplified to each rater by means of phrases expressing each
theme.


Table 1


Message Unity, Mean Response Potential, and Number
of Vitamin Orders for Three Messages


                           Mean           Number of Vitamin
            Message        Response       Orders per 1,000
Message     Unity          Potential a    Messages Mailed
A           16.20          14.07          34
B           70.59          16.35          47
C           90.16          16.39          58


a One response potential questionnaire was received with
each order, hence number of questionnaires per 1,000 messages
is identical with number of orders.


Emphasis, similarity, and feeling-tone of themes were
defined for the raters by examples. Raters were
asked to assign to each theme in each message a
number from 1-10, first for emphasis, then for simi-
larity, then for feeling-tone. Judgments were to be
made in accordance with rating criteria suggested by
Guilford (1954).

To eliminate biases, the ratings were adjusted ac-
cording to methods which Guilford (1954) suggests,
to correct for leniency and halo error, rater-theme in-
teraction and rater-message interaction. After ad-
justment, the ratings of the four judges were aver-
aged. The reliability of unadjusted and adjusted
ratings was assessed by the average rank-order in-
tercorrelation between raters. From the mean ad-
justed ratings, message unity was calculated in terms
of the definition that message unity is the sum of
theme feeling-tones, each weighted by theme empha-
sis and reciprocal theme similarity.

The three messages were sent to 3,000 vitamin buy-
ers, divided into three equal groups. Names of pur-
chasers were procured from a commercial list of
previous vitamin buyers. The 3,000 names were re-
ceived on 8.5” by 11” sheets of gummed labels with
sixty names per sheet. Systematic sampling was
performed by cutting each sheet of labels into three
parts of twenty labels each. The parts were then
shuffled like a deck of cards and the resulting pile of
partial sheets was separated into three piles by deal-
ing from the top in serial order. Thereby the entire
3,000 names were each assigned to one of three groups.
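
The assignment procedure described above can be simulated. The sketch below is an illustration only: it treats each part-sheet as a block of 20 consecutive names, shuffles the 150 blocks, and deals them in rotation into three piles. The function name, its parameters, and the use of a seeded pseudo-random shuffle in place of hand shuffling are assumptions.

```python
import random


def assign_groups(names, part_size=20, n_groups=3, seed=None):
    """Cut the list into part-sheets, shuffle them, and deal into piles."""
    # Each sheet of 60 labels cut in three gives blocks of 20 consecutive names.
    parts = [names[i:i + part_size] for i in range(0, len(names), part_size)]
    random.Random(seed).shuffle(parts)
    groups = [[] for _ in range(n_groups)]
    for position, part in enumerate(parts):
        groups[position % n_groups].extend(part)  # deal in serial order
    return groups


groups = assign_groups(list(range(3000)), seed=1)
print([len(g) for g in groups])  # three groups of 1,000 names each
```

Because whole blocks of 20 neighboring labels travel together, names that were adjacent on the commercial list tend to land in the same group, which is worth keeping in mind if the list was ordered by region or date.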

Response potential was construed as the score from 
a questionnaire accompanying the messages. It con- 
sisted of a series of forced-choice questions, presumed 
capable of detecting tendency to act (here, to buy 
vitamins). The alternatives of each question were 
weighted according to a method proposed by Guil- 
ford (1954), in which empirical weights were as- 
signed based on proportions of response and vali- 
dated in terms of mean response potential for a 

message and the number of vitamin orders the mes- 
sage produced. 




Results and Discussion 


Reliabilities, as measured by mean rater 
intercorrelations for adjusted themes, from 
which message unity was calculated, ranged 
from 0.50 to 0.87. Validating correlation be- 
tween mean response potential for a given 
message and the number of vitamin orders for 
the message was found to be 0.95. Basic data 
are given in Table 1. 

Product-moment correlation coefficient between individual response
potential and the three values of message unity was 0.57, which
was statistically greater than zero at a t value significant beyond
the 1% level of confidence. A linear relationship was established
between individual response potential and message unity. The
relationship is expressed by the equation:


$U_r = 0.036\,U_m + 13.6$


where $U_r$ is individual response potential and $U_m$ is message
unity. By setting confidence intervals, it was discovered that the
slope ranged between 0.017 and 0.055 and the intercept between 12.2
and 14.9. The slope was shown to be statistically different from
zero by a t value significant beyond the 1% level of confidence. An
analysis of variance technique revealed statistically insignificant
deviations from linear regression for individual response potential
on message unity. (Deviations from linear regression < 0.10.)
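
The fitted line can be applied to the three message-unity values of Table 1. The sketch below uses the reported slope and intercept; the function name is illustrative. Since the regression was fitted to individual response-potential scores, the predictions need not coincide exactly with the tabled group means.

```python
def predicted_response_potential(message_unity, slope=0.036, intercept=13.6):
    """Response potential predicted from message unity by the reported line."""
    return slope * message_unity + intercept


unity = {"A": 16.20, "B": 70.59, "C": 90.16}      # Table 1
observed = {"A": 14.07, "B": 16.35, "C": 16.39}   # mean response potential, Table 1

for message, u in unity.items():
    print(message, round(predicted_response_potential(u), 2), observed[message])
```

The predicted values are about 14.18, 16.14, and 16.85, each within half a point of the corresponding observed mean.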

Statistically, response potential is a linear function of message
unity, a fortiori there is some relationship between the two
variables and verification is lent both to the research hypothesis
and the theory from which it is derivable. Broader substantiation is
needed, of course.


Received October 1, 1958.


REFERENCES

Doob, L. W. Public opinion and propaganda. New
York: Holt, 1948.

Guilford, J. P. Psychometric methods. (2nd ed.)
New York: McGraw-Hill, 1954.

Hovland, C. I., Janis, I. L., & Kelley, H. H. Com-
munication and persuasion. New Haven: Yale
Univer. Press, 1953.

Smith, B. L., Lasswell, H. D., & Casey, R. D.
Propaganda, communication and public opinion.
Princeton: Princeton Univer. Press, 1946.




Journal of Applied Psychology
Vol. 43, No. 6, 1959


EFFECT OF SUBLIMINAL CUES ON TEST RESULTS 1


HEBER C. SHARP


Utah State University


Recent assertions that visual stimuli presented at exposure speeds
in excess of conscious perception can influence behavior have
caused widespread interest and some concern (Brooks, 1957; Cousins,
1957). McConnell, Cutler, and McNeil (1958) have reviewed recently
the work in subliminal perception and point to the lack of
experimental evidence supporting this technique as a means of
influencing behavior. Klein (1955) and his co-workers found that
different sexual and symbolic figures exposed subliminally provoked
different impressions of consciously perceived pictures of people.
Smith and Henriksson (1955) have demonstrated that subliminal
stimuli affect measurably conscious perception. These carefully
controlled experiments involve presenting stimuli to individuals
subliminally. Presenting stimuli to groups in which individual
differences are not controlled would be similar to the conditions
experienced by advertisers. This research is concerned with the
extent to which individuals might be influenced in educational
achievement. Subliminal cues suggesting correct answers to test
questions should increase test scores and conversely subliminal
cues which misinform should affect test scores adversely. The need
to do well on tests should lower the threshold of Ss.


Method 


Sixty-two Ss were drawn from a section of elementary general
psychology and were divided into two groups on the basis of sex and
upon scores earned on the first major test. Two Ss, one in each
group, failed to complete the experimental work and were dropped.
Three filmstrips of 50 test items each were constructed from
material in the text. All items were multiple-choice items with
four alternative choices. Slides were made representing the number
of the alternative choices. Two standard slide projectors were
employed. One was used to present the test items on the filmstrip
and the other was used to present tachistoscopically the correct or
incorrect alternative choice of the test item.


1 This research was supported by the University Research Fund,
Utah State University.

A Graphex shutter made by Wollensak was adapted
to one slide projector to present the subliminal cues.
The accuracy of the shutter was checked with a
Hewlett-Packard Electronic Counter, Model 522B.
With the shutter set at 1/200 sec. the actual exposure
time was 11.8 ± .08 msec. Each test item was ex-
posed from 25 to 35 sec. depending on the length of
items. The subliminal cues were superimposed upon
the test item 3 times, at approximately 7-sec. inter-
vals, with the shutter speed set at 1/200 sec. and
shutter opening at F32.

Filmstrip No. 1, the first major test, furnished
one criterion for group placement. Blank slides were
used in the second projector to keep light, noise, and
other activity associated with the operation of pro-
jectors constant. At each test session the filmstrip
was presented twice; answer sheets were collected
after each presentation. Filmstrip No. 2, the second
major test, was shown first with blank slides used
in the second projector; and during the second show-
ing, slides with the number of the correct alterna-
tive choice were superimposed upon the test item as
subliminal cues. This was used as a check to deter-
mine whether or not the groups differed significantly
in ability to respond to the subliminal cues. Film-
strip No. 3, the third major test, was shown first
with blank slides, and on the second showing, Group
1 had alternate right and wrong cues presented and
Group 2 had only right cues presented.

Ss were told that a new method in test presenta- 
tion was being developed and that they were not to 
discuss any procedures or conditions until after the 
three tests were completed. At the close of each 
testing session Ss were asked to note on the back of 
their answer sheet whether or not anything unusual 
occurred during the test period. They were asked 
to list any cues of a helpful nature or any distrac- 
tion in connection with the test situation. Comments 
concerning the test or test procedure were encour- 
aged. Ss were called in for brief interviews after 
the test sessions. 


Results 


An examination of individual performances revealed that 16 Ss, 9 from
Group 1 and 7 from Group 2, were able to respond to the cues throughout
the experiment. Eight Ss, 3 from Group 1 and 5 from Group 2, were
unable to experience any of the cues. Sixty percent, 36 Ss, were unable
to respond to the subliminal stimuli on Test 2, second showing, but
were able to see similar cues on Test 3, second showing. Since a mean
score change of 5 or more points was significant at the .01 level, an
individual score change of 6 or more points was considered sufficient
to indicate influences other than a practice effect. Of the 16 Ss
reporting an ability to see the cues, all showed positive gains and 14
showed changes of 6 or more points. Of the 8 Ss reporting no ability to
see, only 2 showed such changes. Of the 36 Ss who reported no ability
to respond to cues on their first experience, 5 had score changes of 6
or more points. On their second experience all showed changes of 2 or
more points and 26 Ss had changes of 6 or more points.
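
The counts reported above can be tallied to confirm that the three classes of Ss account for the whole sample and that the 36 "learners" are the 60% referred to. This is a small sketch in Python; the dictionary keys are descriptive labels introduced here, not terms used in the study.

```python
# Counts as reported in the text; labels are illustrative assumptions.
groups = {
    "responded throughout": {"n": 16, "gain_6_plus": 14},
    "never responded":      {"n": 8,  "gain_6_plus": 2},
    # For this class the count of 6-point gains is for the second experience.
    "learned by Test 3":    {"n": 36, "gain_6_plus": 26},
}

total = sum(g["n"] for g in groups.values())
shares = {name: g["n"] / total for name, g in groups.items()}
rates = {name: g["gain_6_plus"] / g["n"] for name, g in groups.items()}

for name in groups:
    print(f"{name:22s} share of sample {shares[name]:.0%}, "
          f"gains of 6 or more points {rates[name]:.0%}")
print("total Ss:", total)
```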

A comparison of the groups is shown in 
Table 1. Group 2 performed progressively 
less well on the first showing of the filmstrip. 
The reversal of differences on Test 3 between 
the first and second showing likely resulted 
from Ss responding to the wrong answers, 
thus lowering the mean score. 

Table 2 shows a comparison between the mean scores earned by the groups
on the first and second showings of the filmstrip. Experimental
variables were introduced on the second showing of Test 2 and Test 3.
The influence of the correct cue raised the mean scores significantly
(p < .001), but when alternate right and wrong cues were used (Group 1,
Test 3), no rise in test score occurred. Sixteen of the 30 Ss had
scores within 2 points of the 25 correct cues presented. No evidence
was found to indicate a sex difference in experiencing the subliminal
cue.


Table 1
Comparison of the Mean Test Scores between Groups
N = 30

                      M          M
Test     Showing   Group 1    Group 2    Diff.        P
Test 1   1st        28.26      27.27     [illegible]
         2nd        28.40      28.10      0.30
Test 2   1st        27.10      23.80      3.30       .10
         2nd        32.87      30.77      2.10
Test 3   1st        30.47      26.40      4.07       .03
         2nd        30.90      36.23      5.23       .01


Table 2
A Comparison of Means on First and Second Showings of the Test Films
N = 30 in all cases

                       M         SD           Diff.        P
Group 1
  Test 1   1st       28.26      6.54
           2nd       28.40      6.43           .14
  Test 2   1st       27.10     [illegible]
           2nd       32.87     12.53         [illegible]
  Test 3   1st       30.47      7.60
           2nd       30.90      6.92         [illegible]
Group 2
  Test 1   1st       27.27      7.31
           2nd       28.10      6.79           .83
  Test 2   1st       23.80     [illegible]
           2nd       30.77     10.69          6.97        .001
  Test 3   1st       26.40      8.31
           2nd       36.23      8.26          9.83        .001


Discussion


The majority (60%) of Ss learned to respond to the stimulus after it
had been presented one or more times. Bricker and Chapanis (1953) have
found that when a stimulus slightly below the limen was presented once,
the likelihood of its being recognized on subsequent trials increased.
These data show such an increase.

The tests used in this study were a major factor in determining S’s
grade for the course. It seems reasonable to assume that Ss would have
a fairly high need to do well on the tests, although no measure for
need achievement was employed. McClelland and Lieberman (1949) have
shown that Ss with a high need to achieve had lower thresholds for
success words than Ss who scored low on need achievement.

Boswell’s (1958) data show a significantly greater recognition of
stimuli at threshold levels on the ascending function of the alpha
cycle. This may offer an explanation for Ss who experienced a sudden
awareness of the cues; after once recognizing the cue, the likelihood
of recognition of other cues seemed to be increased.

Monnier (1952) and Cobb and Morton (1952) have shown that as the
stimulus intensity decreases, the retino-cortical transmission time
increases. The use of low intensity stimuli for subliminal work would,
therefore, require a longer transmission time than would the better
illuminated material upon which it was superimposed. The difference in
light intensity of the two stimulating sources (test items and the
subliminal cue) would necessitate two transmission rates occurring
somewhat simultaneously over the visual pathways. This could lead to
confusion and may tend to make S respond to one or the other. The Ss
who attended to the test item would likely not be aware of the cue and
it would remain subliminal for them. On the other hand, the Ss who
attended to the cues and not to the test items would show test scores
following the pattern of the cues shown.

This would explain the tendency of Ss in Group 1, Test 3, to respond
incorrectly when the cue indicated a noncorrect answer. Ss may be aware
vaguely of the test item and may feel that an occasional cue was wrong
but would respond to the cues since it was the chosen method of attack.
Introspective reports of Ss in Group 1 tended to support this
explanation.

Four possible factors have been suggested to explain Ss’ behavior in
learning to respond to stimuli which were, for the large majority of
Ss, subliminal when first presented. These factors were: (a) a lowering
of the threshold as stimuli were repeated; (b) the influence of a need
to do well seemed to increase a [illegible] for cues to help in
responding to test items; hence, a lowering of threshold for those cues
which aided in recording the answer; (c) the likelihood of a cue once
presented on the optimal phase of the alpha cycle enhanced its
subsequent recognition; and (d) the difference of possible neural
conduction rates [illegible] S to choose to respond either to the cue
or to the test item.


Summary 


Sixty Ss from a general psychology section were divided into two
groups. Subject matter tests were projected on the screen (items on
filmstrips). During the control periods blank slides were used for
tachistoscopic presentation. In the experimental sessions correct
answers, and in one session alternate right and wrong answers, were
presented subliminally (11.84 ± .08 msec.). The “hidden” cue was
superimposed upon the item at low illumination intensity three times
during the exposure period of the test item.

Significant (p < .001) positive changes in 
mean scores occurred when right answers were 
presented as the cue. 

When alternate right and wrong answers 
were presented, 16 of the 30 Ss in Group 1 
received scores within 2 points of the 25 cor- 
rect answers presented. Although 50% of Ss 
recognized that some answers were wrong, 
they tended to record these wrong answers. 

A majority (60%) of the Ss learned to 
perceive consciously the “hidden” stimulus. 

No sex differences were found. 


REFERENCES 


Boswell, R. S. An investigation of the phase of the alpha rhythm in
relation to visual recognition. Unpublished doctoral dissertation,
Univer. of Utah, 1958.

Bricker, P. D., & Chapanis, A. Do incorrectly perceived tachistoscopic
stimuli convey some information? Psychol. Rev., 1953, 60, 181-188.

Brooks, J. The little ad that isn’t there. Consumer Rep., 1958, 23,
No. 1.

Cobb, W., & Morton, H. B. The human retinogram in response to high
intensity flashes. EEG clin. Neurophysiol., 1952, 4, 547-556.

Cousins, N. Smudging the subconscious. Saturday Rev., 1957, 40, No. 40.

Klein, G. S., Spence, D. P., Holt, R. R., & Gourevitch, Susannah.
Preconscious influences upon conscious cognitive behavior. Amer.
Psychologist, 1955, 10, 387. (Abstract)

McClelland, D. C., & Lieberman, A. M. The effect of need for
achievement on recognition of need related words. J. Pers., 1949, 18,
236-251.

McConnell, J. V., Cutler, R. L., & McNeil, E. B. Subliminal
stimulation. Amer. Psychologist, 1958, 13, 229-242.

Monnier, M. Retinal, cortical, and motor responses to photic
stimulation in man: Retino-cortical time and optomotor integration
time. J. Neurophysiol., 1952, 15, 469-486.

Smith, G. J. W., & Henriksson, M. The effect of an established percept
on a perceptual process beyond awareness. Acta psychol., 1955, 11,
346-355.


(Received October 27, 1958) 


Journal of Applied Psychology
Vol. 43, No. 6, 1959


SIMULATED PATTERNS ON THE EDWARDS
PERSONAL PREFERENCE SCHEDULE ¹


CHARLES F. DICKEN ²


Counseling and Testing Center, Stanford University 


An important problem in the use of inven- 
tories of interest and personality is the sus- 
ceptibility of the scores to simulation. At 
least three forms of conscious attitude may 
result in scores which fail to represent accu- 
rately the characteristics of the individual: 
(a) deliberate “faking” with intent to deceive 
the test user, (b) response in terms of an ideal 
self concept rather than a candid self-ap- 
praisal, and (c) response in terms of an 
“honest” but inaccurate or uninsightful self- 
assessment. 

One line of approach to the simulation problem has been concern for
item selection and item subtlety (Gough, 1954; Meehl, 1945; Seeman,
1952; Wiener, 1948). Another approach has been the development of
validity scores for detecting or counterbalancing bias introduced by
test-taking attitude (Gough, 1952; Humm, Storment, & Iorns, 1944; Meehl
& Hathaway, 1946).

Simulation has been investigated experimentally by asking Ss to assume
a specified role in responding to test items. Reviews (Gough, 1950;
Meehl & Hathaway, 1946) of the extensive literature on role playing of
the “fake good” and “fake bad” dimensions identified by Meehl and
Hathaway (1946) indicate that the validity score approach, when
available, is reasonably efficient in detecting these forms of
simulation. Role-playing studies of structured inventory scales or
patterns relating to specific personality traits or interest attributes
(Bordin, 1943; Gough, 1947; Kelly, Miles, & Terman, 1935; Longstaff,
1948; Sundberg & Bachelis, 1956; Sweetland, 1948; Wesman, 1952) have
consistently found substantial alterations in the scores of Ss
instructed to simulate. Validity scores have ordinarily been
unavailable in studies of this type, although there is some evidence
that they can be effective (Gough, 1947).


¹ The author gratefully acknowledges the assistance of Ralph Granneberg
in obtaining and testing the Ss and of John Black in the data analysis.

² Now at the University of Chicago.

The most recent line of attack on the problem of the descriptive
accuracy of structured inventory scores concerns what Jackson and
Messick (1958) have termed “stylistic determinants” of item response.
Tendencies to acquiesce and to respond in terms of the social
desirability of the item are two major instances of stylistic
determinants. Jackson and Messick reviewed the experimental evidence
and concluded, “. . . stylistic determinants . . . as distinct from
specific content, account for a large proportion of response variance
on some personality scales, particularly the California F scale, the
MMPI, and the California Psychological Inventory” (1958, p. 250).

The Edwards Personal Preference Schedule (EPPS) (Edwards, 1957) was
constructed to measure a set of personality variables drawn from
Murray’s (1938) list of manifest needs.³ The unique feature of the
Schedule is an attempt to control the social desirability (SD) factor
by means of a forced-choice format in which paired items scored for
different variables are equated for independently scaled SD. Control of
SD would presumably eliminate one means by which a test S can obtain
scores which are not truly characteristic of him, that of responding in
the socially desirable direction.
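
The forced-choice format described above can be illustrated with a toy scoring routine. This is only a sketch: the four item pairs, the three variables, and the response patterns are hypothetical and bear no relation to the actual EPPS items or keys. It shows that every endorsement adds a point to one variable at the expense of the other member of the pair, so that the total number of points is fixed.

```python
from collections import Counter

# Hypothetical item pairs: each pair sets a statement keyed to one
# variable against a statement keyed to another variable.
pairs = [("ach", "def"), ("ord", "ach"), ("def", "ord"), ("ach", "ord")]


def score(choices):
    """Score one record; choices[i] is 0 or 1, the member of pairs[i] endorsed."""
    tally = Counter({variable: 0 for pair in pairs for variable in pair})
    for (first, second), pick in zip(pairs, choices):
        tally[second if pick else first] += 1
    return dict(tally)


# A hypothetical candid record, and a record from the same S who now
# endorses the "ord" statement wherever one is offered.
candid = score([0, 1, 0, 1])
faked = score([0, 0, 1, 1])

print("candid:", candid)
print("faked: ", faked)
print("totals:", sum(candid.values()), sum(faked.values()))
```

In this toy case the gain on one variable is matched exactly by losses on the others, which is the dependence between increases and decreases that has to be kept in mind when changes on individual scales are tested.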

Recent evidence on the EPPS casts doubt on the success of the control
of SD and on the resistance of the Schedule to simulation. Corah,
Feldman, Cohen, Gruen, and Ringwall (1958) found that 20 of 30 item
pairs sampled from the EPPS differed significantly in intrapair SD when
judged as pairs, and found a high correlation between these differences
and the probability of endorsement of the items. Borislow (1958)
studied EPPS response changes in Ss who were first tested under
standard instructions and then asked to role-play social desirability
or personal desirability. Both role-playing groups differed
significantly in number of item responses altered and in test-retest
profile correlations from a control group retested under standard
instructions. Neither the consistency score nor profile stability
coefficients discriminated simulated profiles from controls. Borislow
interpreted his findings as indicating susceptibility of the EPPS to
faking, but his small Ns prevented a descriptive analysis of score
changes for the two role-playing samples, and he rejected the
hypothesis of a differential effect of the two role-playing conditions.


³ The names of the EPPS variables are as follows: Achievement (ach),
Deference (def), Order (ord), Exhibition (exh), Autonomy (aut),
Affiliation (aff), Intraception (int), Succorance (suc), Dominance
(dom), Abasement (aba), Nurturance (nur), Change (chg), Endurance
(end), Heterosexuality (het), Aggression (agg). A consistency score
(con) is also computed, based on the number of identical choices made
in two sets of the same 15 items.

The present study investigated the qualitative properties of EPPS score
changes under four different role-playing instructions. The hypotheses
were: (a) Subjects motivated to simulate a personality trait are
capable of inducing substantial changes in their EPPS scores. (b)
Substantial score changes will occur under the role-playing of a “good
impression,” in spite of the attempted control of the SD factor. (c)
Subject groups that role-play different personality variables will
obtain different simulated patterns. (d) The consistency score is not
an effective index of simulation.


Method 


The EPPS was administered with standard instructions to 75 students in
five introductory psychology classes at the City College of San
Francisco. The Ss for the experiment ranged in age from 18 to 30. They
were permitted to identify their records by code numbers to preserve
anonymity.

The sample was then divided into four role-playing groups: need order
(ORD), 8 males, 9 females; need dominance (DOM), 8 males, 11 females;
need change (CHG), 13 males, 7 females; and good impression (GI), 8
males, 11 females. The first three roles were chosen to represent a
variety of the EPPS variables and to correspond roughly to three
variables under investigation in a parallel study of simulation of the
California Psychological Inventory. The fourth role relates to
Hypothesis b.

Each group was retested separately with instructions to simulate for
the purpose of winning an imaginary but highly desirable college
scholarship. Subjects in each of the first three groups were told to
suppose a hypothetical “scholarship committee” used the EPPS to select
individuals with a particular kind of personality trait. The name of
the need variable and a three- or four-sentence description based on
Murray (1938) and reproduced below were read to the group and printed
on a blackboard visible throughout the session.

Need for order. A person with a need for order wants to achieve
organization, neatness, and precision. This kind of person aims for
perfection in details, attempts to keep possessions and work in careful
order, and is exact and precise in speech and manner. Persons with a
need for order behave in an organized, restrained, and careful manner
in whatever they do.

Need for dominance. A person with a need for 
dominance wants to influence, persuade, or direct 
other people by suggestion, persuasion, or command. 
This kind of person tries to get others to cooperate 
with him and to convince them of the rightness of 
his opinions. Persons with a need for dominance 
desire to lead, influence, guide, govern or supervise 
other people. Note that a person with a need for 
dominance need not necessarily be domineering or 
unpleasant in his conduct. 

Need for change. A person with a need for change seeks variety,
newness, and adventure in personal experiences. This kind of person
avoids regularity or repetition in habits of living, attempting instead
to experiment and to do things differently. Persons with a need for
change are flexible and adaptable and enjoy changing their methods,
habits, and preferences.

The GI group was told to respond so as to give the most favorable
possible impression of themselves to the scholarship committee, without
further specification of role. One week elapsed between the first and
second test administrations for all groups.


Table 1

Means, Standard Deviations, and Mean Differences of EPPS Scores under
Standard (Std) and Simulated Need Order (ORD) Conditions
(N = 17)

               Std               ORD
Scale       M      SD         M      SD          D
ach       53.5    10.6      64.2    11.2      10.7**
def       52.6     8.8      68.1     7.2      15.5**
ord       53.2    11.9      83.9     8.7      30.7**
exh       50.9     8.2      44.2    11.5      -6.8**
aut       46.4     6.6      37.8     6.2      -8.6**
aff       45.2     8.1      39.3     7.6      -5.8**
int        --      --       49.9     8.0       3.4
suc        --      --       45.9     8.2      -0.6
dom        --      --       47.8    15.0      -2.8
aba        --      --       48.8    10.6      -1.2
nur        --      --       40.8     8.3      -5.5**
chg        --      --       31.9    10.5     -21.0**
end        --      --       72.3     4.0      18.7**
het        --      --       35.9     8.5     -12.7**
agg        --      --       42.4     9.1      -9.1**
con        --      --       49.7     9.9      -1.9

[The Std entries for int through con are missing in the source scan.]

* Differs from zero at .05 level.
** Differs from zero at .01 level.


Table 2

Means, Standard Deviations, and Mean Differences of EPPS Scores under
Standard (Std) and Simulated Need Dominance (DOM) Conditions
(N = 19)

               Std               DOM
Scale       M      SD         M      SD          D
ach       52.0    10.1      62.7     9.5      10.7**
def       53.8     8.8      46.3    14.8      -7.6**
ord       52.7    10.6      50.2     9.9      -2.5
exh       47.0    10.3      63.8    11.7      16.8**
aut       51.6    12.3      49.4     9.5      -2.3
aff       43.4     9.0      49.5     8.1      [illegible]
int       51.6    10.2      46.5     8.8      -5.2**
suc       49.1    11.2      44.7     8.1      -4.4**
dom       46.0    10.7      68.7     7.0      22.6
aba       48.6     7.7      35.4     6.4      [illegible]
nur       47.3     8.6      49.2    15.0       1.9
chg       52.5     9.6      36.7     8.2      [illegible]
end       53.6     7.7      50.0     8.1      -3.5*
het       49.8    13.2      42.4    11.2      [illegible]
agg       49.7    10.2      58.9     8.7       9.3**
con       53.8     8.6      46.5    10.1      -7.3

* Differs from zero at .05 level.
** Differs from zero at .01 level.


Results 


Tables 1-4 show the means and standard deviations of the EPPS scores of
the four role-playing groups for standard and simulation conditions.
The mean difference scores (simulation condition minus standard
condition) are also shown.⁴ The means for the standard condition are
comparable for the four samples and are, in the main, reasonably close
to those of Edwards’ normative sample.


⁴ The raw scores were converted to T score values appropriate to the
sex of the S. Preliminary analysis of the four samples indicated no
substantial sex differences in either standard or simulation
conditions. There were no male-female reversals of the direction of
mean change for scores where both change score means differed
significantly from zero. Data for male and female Ss were combined for
the main analyses.


The t test for correlated measures was used to compare the mean
differences against hypotheses of zero difference. The effect of the
forced-choice format of the EPPS on score changes in simulation should
be considered in interpreting the outcomes of the significance tests
for individual scales. An item response which increases an S’s score on
one variable also decreases his score on some other variable. Thus
while the increases in any sample are independent of each other, and
while the set of decreases is similarly internally independent, the
increases and the decreases are not independent. A conservative
interpretation would consider the significance of the changes for a
single direction only (increases being probably of greater interest
here), and would treat the opposite changes (e.g., decreases) in terms
of relative magnitude only.
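
The t test for correlated measures mentioned above can be sketched as follows. The scores are hypothetical T scores for five Ss on one scale, not data from the study, and the function name `paired_t` is introduced here for illustration. The statistic is the mean difference (simulation minus standard) divided by its standard error, with N - 1 degrees of freedom.

```python
import math
from statistics import mean, stdev


def paired_t(standard, simulated):
    """Return (mean difference, t, df) for two correlated sets of scores."""
    if len(standard) != len(simulated) or len(standard) < 2:
        raise ValueError("need two equally long lists of at least 2 scores")
    # Difference score for each S: simulation condition minus standard.
    diffs = [sim - std for std, sim in zip(standard, simulated)]
    n = len(diffs)
    d_bar = mean(diffs)
    # Standard error of the mean difference (sample SD with n - 1).
    # This sketch does not guard against all differences being equal.
    standard_error = stdev(diffs) / math.sqrt(n)
    return d_bar, d_bar / standard_error, n - 1


# Hypothetical scores for five Ss on a single scale.
std_scores = [50, 48, 55, 52, 47]
sim_scores = [51, 50, 58, 56, 52]

d_bar, t, df = paired_t(std_scores, sim_scores)
print(f"mean difference = {d_bar}, t = {t:.2f}, df = {df}")
```

The obtained t would then be referred to a table of t for the stated degrees of freedom. Because of the forced-choice dependence noted in the text, tests on scales that moved in opposite directions should not be read as independent confirmations.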
Table 3

Means, Standard Deviations, and Mean Differences of EPPS Scores under
Standard (Std) and Simulated Need Change (CHG) Conditions
(N = 20)

Scale       M      SD        D (as extracted)
ach       47.2    12.9      [illegible]
def       47.0     8.4      [illegible]
ord       43.3     9.9      10.0**
exh       57.7    12.8      13.0**
aut       60.4    10.1       3.6
aff       43.0     7.6      [illegible]
int       46.1     6.6      -.8
suc       48.6     7.4       6.9**
dom       51.7     8.7      10.4**
aba       42.8     8.7      [illegible]
nur       43.0    10.1      12.3**
chg       65.1    10.2       6.6**
end       45.0    10.2      [illegible]
het       50.0     7.9       7.0**
agg       59.5     8.2      -4.0*
con       43.3    14.4      [illegible]

[Only one pair of M and SD columns survives in the source scan; the
values appear to belong to the CHG condition. Signs and row alignment
of the D column are uncertain.]

* Differs from zero at .05 level.
** Differs from zero at .01 level.


Table 4

Means, Standard Deviations, and Mean Differences of EPPS Scores under
Standard (Std) and Simulated Good Impression (GI) Conditions
(N = 19)

               Std               GI
Scale       M      SD         M      SD          D
ach       52.0     8.1      63.0     8.0      11.0**
def       47.8     8.4      68.6     7.2      20.8**
ord       54.5    12.5      71.8     8.7      17.3**
exh       52.4    14.4      44.7    10.0      -7.7**
aut       50.3     9.7      38.3    10.7     -12.0*
aff       47.0     6.6      42.9     7.1      -4.1*
int       47.9    10.0      50.4     6.7       2.5
suc       50.4    10.0      42.5     7.3      -8.0*
dom       51.2    10.3      52.4     9.8       1.2
aba       48.2     9.7      46.9     7.4      -1.2
nur       48.5     8.4      45.9    10.1      -2.5
chg       50.4    11.5      42.2     8.9      -8.2**
end   [illegible] [illegible] 67.8   7.8      16.1**
het       46.5    11.7      34.6     8.5     -11.8**
agg       51.0     7.8      41.6     9.1      -9.4**
con       47.8    11.5      47.5    11.9       -.3

* Differs from zero at .05 level.
** Differs from zero at .01 level.


The effect of the role instructions on the similarity of individual
EPPS profiles within each sample is shown in Table 5. Score patterns of
individuals show little concordance in the standard conditions, but
have a highly significant level of concordance in every simulation
condition. This indicates a shift from an “individual” pattern of
responses to a “role-characteristic” pattern when the S simulates.
Borislow’s (1958) concordance values for his simulated social
desirability (SD) and personal desirability (PD) groups are included in
the table for comparison. The present good impression group appears to
have simulated in a more homogeneous fashion than the earlier SD group.
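
The coefficient of concordance used for this comparison can be sketched as below. It uses the standard formula W = 12S / [m²(n³ - n)], where m is the number of Ss, n the number of scales ranked within each profile, and S the sum of squared deviations of the rank sums from their mean. The rankings are hypothetical, the function name is introduced here, and no correction for tied ranks is applied.

```python
def kendalls_w(rankings):
    """Kendall's W for m rankings of the same n objects (no tie correction)."""
    m = len(rankings)        # number of Ss (profiles)
    n = len(rankings[0])     # number of scales ranked in each profile
    # Sum of the ranks given to each scale across all Ss.
    rank_sums = [sum(profile[j] for profile in rankings) for j in range(n)]
    mean_rank_sum = m * (n + 1) / 2
    s = sum((total - mean_rank_sum) ** 2 for total in rank_sums)
    return 12 * s / (m ** 2 * (n ** 3 - n))


# Hypothetical profiles of four scales, each expressed as ranks 1 to 4.
w_agree = kendalls_w([[1, 2, 3, 4], [1, 2, 3, 4], [1, 2, 3, 4]])
w_oppose = kendalls_w([[1, 2, 3, 4], [4, 3, 2, 1]])

print("identical profiles:", w_agree)
print("opposed profiles:  ", w_oppose)
```

W runs from 0 (no agreement among profiles) to 1 (identical rank order in every profile), so a rise in W from the standard to the simulation condition is what a shift toward a common, role-characteristic pattern would produce.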
The large and statistically reliable mean changes in all samples and
the consistent concordance shifts confirm Hypotheses a and b. The
differences in the mean simulated patterns and the between-condition
correlations of mean changes (Table 6) confirm Hypothesis c, with one
exception. The three trait-simulation conditions induced mean changes
in the 15 variables which are either essentially uncorrelated or
negatively correlated. In each case the pattern of prominently elevated
scores is different, and the peak score is on the relevant variable.
However, conditions ORD and GI yielded changes which are highly
correlated, and mean simulated profiles which are for practical
purposes indistinguishable.

Edwards found relatively low intercorrela- 
tions of the EPPS variables in the normative 
sample. However, the simulation instructions 
in the present experiment induced significant 
changes in scales other than the “primary” 
scale for which the instructions were written. 
One hypothesis which might account for the 
changes in the “nonprimary” scales is that 
these changes relate to the size of the correla- 
tions of the nonprimary scales with the pri- 
mary scale, even though the correlations are 
of a generally low order. The rank difference 
correlations between the amount of change in 
nonprimary scales and the magnitude of the 
normative sample correlations of these scales 
with the primary scale are positive and sig- 
nificant in conditions ORD (rho = .85) and 
DOM (rho = .55), but there is no associa- 
tion in condition CHG (rho = .03). There 
is no immediate explanation for the failure 
of the hypothesis in the CHG condition, al- 
though it may be noted that the score changes 
and the concordance shift are least in this 
condition. 
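
The rank difference correlation used here and in Table 6 can be sketched with the usual formula rho = 1 - 6Σd² / [n(n² - 1)], where d is the difference between the two ranks of each scale. The values below are hypothetical, the function names are introduced for illustration, and the sketch refuses tied values rather than correcting for them.

```python
def ranks(values):
    """Rank values from 1 (smallest) upward; ties are not handled."""
    if len(set(values)) != len(values):
        raise ValueError("tied values are not handled in this sketch")
    ordered = sorted(values)
    return [ordered.index(v) + 1 for v in values]


def spearman_rho(x, y):
    """Spearman rank difference correlation for untied data."""
    n = len(x)
    sum_d_squared = sum((rx - ry) ** 2 for rx, ry in zip(ranks(x), ranks(y)))
    return 1 - 6 * sum_d_squared / (n * (n ** 2 - 1))


# Hypothetical mean changes on five scales under two conditions.
changes_a = [1, 2, 3, 4, 5]
changes_b = [2, 1, 4, 3, 5]

rho = spearman_rho(changes_a, changes_b)
print(f"rho = {rho:.2f}")
```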

Table 5

Kendall Coefficients of Concordance (W) of EPPS Profiles in Standard
(Std) and Simulation (S) Conditions for Four Role-Playing Samples and
for Borislow’s SD and PD Samples

Sample            Std             S
ORD           [illegible]     [illegible]
DOM              .04          [illegible]
CHG              .08          [illegible]
GI               .04          [illegible]
Borislow PD                      .26**
Borislow SD                      .38**

* W significant at .05 level.
** W significant at .01 level.


Table 6

Between-Condition Spearman Correlations of Ranks of Mean Change Scores
for Fifteen Need Variables

Condition       ORD        DOM                   CHG
DOM            -.02
CHG            -.67*       .22 [sign unclear]
GI              .90*      -.02                  [illegible; negative]

* Rho differs from zero at .01 level (no values significant at .05
level).


Hypothesis d is confirmed by the data from all four conditions.
Although the mean con score decreased in all conditions, the decreases
are not significant in two groups, and the overlap of con scores for
simulation and standard conditions is large in all groups. Even if a
liberal cutting score of 10 or less raw score points on con is used as
a simulation index (which would identify as “simulated” 15% of the
records in the normative sample), only the following relatively small
proportions of the simulated records would be detected: ORD, 5/17; DOM,
6/19; CHG, 8/20; and GI, 5/17. Unsimulated records from these samples
misidentified by this cutting score would be seven, two, seven, and
four cases respectively.
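
The detection figures just given can be put side by side with a few lines of Python. The counts are those printed in the text (including the fraction printed for GI); the variable names are introduced here for illustration.

```python
# Simulated records detected by the cutting score, as (detected, total),
# exactly as the fractions are printed in the text.
detected = {"ORD": (5, 17), "DOM": (6, 19), "CHG": (8, 20), "GI": (5, 17)}

# Unsimulated (standard) records misidentified by the same cutting score.
misidentified = {"ORD": 7, "DOM": 2, "CHG": 7, "GI": 4}

rates = {group: hits / total for group, (hits, total) in detected.items()}
total_hits = sum(hits for hits, _ in detected.values())
total_false = sum(misidentified.values())

for group in detected:
    print(f"{group}: detected {rates[group]:.0%} of simulated records, "
          f"misidentified {misidentified[group]} standard records")
print("all groups:", total_hits, "detected,", total_false, "misidentified")
```

On these counts no group has more than two fifths of its simulated records detected, and the number of standard records wrongly flagged is of the same order as the number of simulated records correctly flagged.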


Discussion 


One of the most important findings is the 
failure of the social desirability pairings of 
items to control the distorting effect of test- 
taking attitudes. The changes in the GI 
condition tend to support the conclusion of 
Corah et al. (1958) that SD is not equated 
in some of the pairs. Even if the item format 
partly controls social desirability bias, which 
seems likely, the role-playing data suggest 
that distortion of the EPPS by simulation of 
characteristics other than SD remains a dis- 
tinct possibility. 

Since the Ss in the three trait-simulation conditions were given
descriptions of the variables they were to simulate, the question
arises whether these experimental distortion sets are meaningfully
related to conscious or unconscious role taking in a normal testing
situation. The reader may verify the degree of similarity of the role
instructions to the content of the EPPS items by reference to the
scoring keys. Some correspondence was unavoidable because of the
“obvious” character of the items. In general, however, the role
instructions make a broader and more abstract reference to the need
variables than do the items, suggesting that to some degree true role
taking rather than information on specific item content determined the
results. The success of the essentially uninstructed “scholarship
applicants” in Condition GI in simulating traits (order, achievement,
endurance, and deference) which are both [illegible] and “desirable”
with respect to the goal being sought argues rather cogently against
assuming that simulation could not occur except in Ss with specific
information about the instrument.

The evidence for susceptibility to distortion gives cause for question
of the feasibility of constructing an instrument for variables of this
type without a systematic procedure for determining item and scale
validities. The Manual makes no reference to item selection other than
for social desirability values. The nature and arrangement of the items
suggest that the questions of subtlety and of comprehensiveness of
content were similarly neglected. Scores for each scale are determined
by endorsement of a very small number of statements (nine), because
statements are repeated in identical form in the sets of item pairs
scored for each scale. The statements for each variable appear highly
“face valid” and are strikingly similar in content. The effect of face
validity and content homogeneity in facilitating selective endorsement
of a particular kind of item is probably augmented by the arrangement
of the test booklet. More than half the statements scored for each
scale appear in “runs” of five consecutive item pairs.

The effect of conscious distortion, the limitations in content and
subtlety of the items, and the inadequacy of validity data ⁵ suggest
there is relatively little basis at present for regarding EPPS scores
as measures with properties other than those of a self-report. If this
conclusion is correct, interpretation of a score as a measure of an
examinee’s actual characteristics rests on the assumption that he is
both (a) able to perceive his own characteristics accurately and (b)
willing to report these perceptions candidly. Meehl’s rationale for
empirically constructed scales, “the scoring does not assume a valid
self-rating to have been given” (1945, p. 299), cannot be used. In
selection problems, (a) is usually unknown and (b) often false. In
counseling, (b) may often be assumed, but if (a) is correct the need
for a personality inventory may be vitiated. The usefulness of earlier
personality inventories dependent on validity of self-report has been
disappointing (Ellis, 1946).


⁵ The Manual contains no validity data other than low correlations of
the scales with other personality scales and mention of some
inconsistent correlations with self-ratings. Subsequent studies of the
[illegible] validity of the EPPS have given partly positive (Zuckerman,
1958) and partly negative (Dilworth, 1958; Himmelstein, Eschenbach, &
Carp, 1958) findings. Construct validity studies have given both
positive and negative findings (Bernardin & Jessor, 1957; Gisvold,
1958; Zuckerman & Grosz, 1958).

A final and important practical implication 
of the findings is that the lack of effective va- 
lidity indices for detecting distorting attitudes 
is one of the most crucial weaknesses of the 
EPPS in its present form. 


Summary 


The simulability of the Edwards Personal 
Preference Schedule was studied in four role- 
playing experiments. Large and reliable 
changes were found in each of three scales 
when Ss were instructed to present them- 
selves as possessing the trait which the scale 
was designed to measure. Changes were also 
found in scores for traits other than the one 
simulated. Simulation of a “good impression” 
yielded substantial and reliable score changes. 
The patterns of mean scores obtained in the 
different role-playing conditions were differ- 
ent. The consistency score did not discrimi- 
nate the simulated records. The results were 
discussed with reference to the failure of the 
attempted control of the social desirability 
factor in eliminating the effect of test-taking 
attitudes, the problem of subtlety, and the validity and practical
usefulness of the instrument.


Received January 22, 1959. 
REFERENCES 


Bernardin, A., & Jessor, R. A construct validation
of the Edwards Personal Preference Schedule with
respect to dependency. J. consult. Psychol., 1957,
21, 63-67.

Bordin, E. S. A theory of vocational interests as
dynamic phenomena. Educ. psychol. Measmt.,
1943, 3, 49-66.

Borislow, B. The Edwards Personal Preference
Schedule (EPPS) and fakability. J. appl. Psychol.,
1958, 42, 22-27.

Corah, N. L., Feldman, M. J., Cohen, I. S., Gruen,
W., Meadow, A., & Ringwall, E. A. Social desirability
as a variable in the Edwards Personal Preference
Schedule. J. consult. Psychol., 1958, 22, 70-72.

Dilworth, T. A comparison of the Edwards Personal
Preference Schedule variables with some aspects
of the TAT. J. consult. Psychol., 1958, 22, 486.

Edwards, A. L. Edwards Personal Preference Schedule,
Manual. (Rev. ed.) New York: Psychological
Corp., 1957.

Ellis, A. The validity of personality questionnaires.
Psychol. Bull., 1946, 43, 385-440.

Gisvold, D. A validity study of the autonomy and
deference subscales of the Edwards Personal Preference
Schedule. J. consult. Psychol., 1958, 22,
445-447.

Gough, H. G. Simulated patterns on the MMPI.
J. abnorm. soc. Psychol., 1947, 42, 215-223.

Gough, H. G. The F minus K dissimulation index
for the MMPI. J. consult. Psychol., 1950, 14,
408-413.

Gough, H. G. On making a good impression. J.
educ. Res., 1952, 46, 33-42.

Gough, H. G. Some common misconceptions about
neuroticism. J. consult. Psychol., 1954, 18, 287-292.

Gough, H. G. Manual for The California Psychological
Inventory. Palo Alto: Consulting Psychologists
Press, 1957.

Himmelstein, P., Eschenbach, A., & Carp, A. Interrelationships
among three measures of need
achievement. J. consult. Psychol., 1958, 22, 451-452.

Humm, D. G., Storment, R. C., & Iorns, M. E.
Combination scores for the Humm-Wadsworth
temperament scale: With consideration of the effects
of subject’s response bias. J. Psychol., 1939,
7, 227-253.

Jackson, D. N., & Messick, S. Content and style
in personality assessment. Psychol. Bull., 1958,
55, 243-252.

Kelly, E. L., Miles, C. C., & Terman, L. M. Ability
to influence one’s score on a typical pencil-and-paper
test of personality. Charact. & Pers., 1935,
4, 206-215.

Longstaff, H. P. Fakability of the Strong Vocational
Interest Blank and the Kuder Preference
Record. J. appl. Psychol., 1948, 32, 360-369.

Meehl, P. E. The dynamics of “structured” personality
tests. J. clin. Psychol., 1945, 1, 296-303.

Meehl, P. E., & Hathaway, S. R. The K factor
as a suppressor variable in the MMPI. J. appl.
Psychol., 1946, 30, 525-564.

Murray, H. A. Explorations in Personality. New
York: Oxford Univer., 1958.

Seeman, W. “Subtlety” in structured personality
tests. J. consult. Psychol., 1952, 16, 278-283.

Sundberg, N. D., & Bachelis, W. D. The fakability
of two measures of prejudice: The California F
scale and Gough's Pr scale. J. abnorm. soc. Psychol.,
1956, 52, 140-142.

Sweetland, A. Hypnotic neurosis: Hypochondriasis
and depression. J. gen. Psychol., 1948, 39, 91-105.


Charles F. Dicken 


Wesman, A. G. Faking personality test scores in a
simulated employment situation. J. appl. Psychol.,
1952, 36, 112-113.

Wiener, D. N. Subtle and obvious keys for the
MMPI. J. consult. Psychol., 1948, 12, 164-170.

Zuckerman, M. The validity of the Edwards Personal
Preference Schedule in the measurement of
dependency-rebelliousness. J. clin. Psychol., 1958,
14, 379-382.

Zuckerman, M., & Grosz, H. Suggestibility and
dependency. J. consult. Psychol., 1958, 22, 328.




Journal of Applied Psychology 
Vol. 43, No. 6, 1959


THE EFFECTS OF PARTIAL PAIRING ON SCALE 
VALUES DERIVED FROM THE METHOD 
OF PAIRED COMPARISONS 


W. W. RAMBO 


Oklahoma State University 


One frequently mentioned criticism of the
method of paired comparisons is the great
number of observations required to scale a
set of stimuli. In an attempt to reduce the
experimental and computational labor associated
with a complete pairing of stimuli,
four general techniques have been advanced
which permit the determination of
scale values from a partial pairing of stimulus
objects. With some minor modification the
computation of scale values is the same for
these procedures as that required by a complete
presentation of stimulus pairs, and the
assumption is made that the obtained values
are similar to those derived from the complete
pairing.

One technique requires an initial ordering
of stimuli by means of some less laborious
procedure, such as ranking or the method of
equal-appearing intervals. A relatively few
stimuli are selected which occupy positions
covering the entire length of the ordered series,
and then all stimuli are compared with
these “representative” standards. A derivative
of this procedure requires judges to compare
only those stimuli which lie fairly close
to one another on the preliminary scale. It
is assumed that comparisons made between
stimuli that are widely separated in the series
would reflect perfect discriminability, and
therefore they would contribute little to the
determination of scale values.

A third procedure requires that the complete
pairing matrix be divided into a number
of submatrices each containing a number of
common stimuli. Complete pairings are made
within each submatrix, and the final scale
values are obtained through adjustments made
using the several scale values assigned to the
stimuli held in common by each matrix.
Finally, a fourth procedure consists of selecting
a random number of pairs from the
judgment matrix in order to arrive at an unbiased
estimate of the scale values. This
procedure requires less effort since it eliminates
the necessity for preliminary ordering
and adjustment of submatrix scale values.

For all of the above procedures there is 
little available in the way of empirical evi- 
dence which demonstrates the effects of the 
reduction in the number of pairs on the scale 
values derived from the method of paired 
comparisons. One study reported by Mc- 
Cormick and Bachus (1952) indicates that 
the number of pairs can be drastically re- 
duced without seriously influencing the scale 
values obtained. These authors present a
method of selecting random pairs which consists
of randomly assigning stimuli to positions
along the borders of the judgment matrix and
then selecting sets of matrix diagonals so that
all stimuli enter into the same number of
comparisons.
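
As an illustration only, the diagonal-selection idea can be sketched in Python. The function name partial_pairing, the cyclic treatment of the diagonals (position i paired with position i + d, wrapping around the border), and the random choice of which diagonals to keep are assumptions made for this sketch; they are not details taken from McCormick and Bachus.

```python
import random


def partial_pairing(stimuli, pairs_per_stimulus, seed=None):
    """Select a balanced subset of pairs from the complete pairing matrix.

    Assumption for this sketch: diagonals are taken cyclically, so the
    diagonal at offset d pairs border position i with position (i + d) mod n.
    Every offset d < n/2 gives each stimulus two comparisons; when n is even,
    the offset n/2 gives each stimulus exactly one.
    """
    order = list(stimuli)
    n = len(order)
    if not 0 < pairs_per_stimulus < n:
        raise ValueError("pairs_per_stimulus must lie between 1 and n - 1")
    if pairs_per_stimulus % 2 == 1 and n % 2 == 1:
        raise ValueError("an odd number of pairs per stimulus needs an even n")

    rng = random.Random(seed)
    rng.shuffle(order)  # random assignment of stimuli to border positions

    offsets = list(range(1, (n - 1) // 2 + 1))
    rng.shuffle(offsets)  # random choice of which diagonals to keep
    chosen = offsets[: pairs_per_stimulus // 2]

    pairs = set()
    for d in chosen:
        for i in range(n):
            pairs.add(frozenset((order[i], order[(i + d) % n])))
    if pairs_per_stimulus % 2 == 1:
        half = n // 2  # the "half-way" diagonal adds one comparison each
        for i in range(half):
            pairs.add(frozenset((order[i], order[i + half])))
    return pairs


if __name__ == "__main__":
    schedule = partial_pairing(range(30), 5, seed=1)
    print(len(schedule), "pairs selected")
```

With 30 stimuli and 5 pairs per stimulus, the sketch returns 75 pairs in which every stimulus appears exactly five times.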

The procedure followed in their investiga- 
tion consisted of instructing judges to rate a 
large number of stimuli which were presented 
in a complete pairing series. Scale values 
were computed from these judgments. Next, 
the number of pairs was systematically re- 
duced by drawing successive subsets of pairs 
from the original matrix, and then scale values 
were recomputed. For one group of 30 stimuli, 
four partial pairing groups were used which 
employed 24, 15, 12, and 8 pairs per stimulus. 
Correlations between the scale values obtained 
from these partial pairing schedules and those 
obtained from the complete matrix ranged 
from .996 down to .898. 

One factor completely confounded by the
procedure followed in the study just cited is
the rating task presented to the judges. The
judgments used in the four partial-pairing series
were extracted from the complete pairing
matrix; hence all of the scale values obtained
were derived from sets of judgments that were
made within the context of the complete pairing
situation. Therefore, fatigue, memory,
and set factors, which doubtlessly vary with
the reduction in the number of pairs, were
identical in all partial pairing matrices.


Table 1


Complete and Partial Pairing Tasks Assigned
to the Six Experimental Groups


          Pairs per     Total N
Group     Stimulus      of Pairs
  Z          29           435
  A          25           375
  B          20           300
  C          15           225
  D          10           150
  E           5            75


The purpose of this investigation is to determine
the relationship between scale values
computed from complete and partial pairings
when the rating task as well as the number
of observations is permitted to vary with
the reduction in the number of pairs. Separate
groups will be used for each of the partial
pairing schedules, thereby permitting scale
values to be determined within the context of
the rating task required by each of these conditions.


Method 


The stimuli considered in this study were 30 nationality
group names which have been extracted
from the Bogardus Scale of Social Distance (1947).
The stimulus pairs were printed on decks of IBM
cards; the order of presentation and location of each
stimulus was determined by tables presented by Ross
(1934). This series yields the maximal separation of
stimulus pairs that have one member in common,
and it counterbalances possible position preferences.
Ten decks of cards were prepared for each of six
experimental groups which reflected a successive reduction
in the number of pairs used to compute the
scale values for these nationality groups. Pairing
schedules were determined following the McCormick
and Bachus procedure. Table 1 shows the number
of cards per deck and the number of observations
per stimulus which went into the definition of the
six rating conditions employed in this study. It will
be noticed that the smallest deck (Condition E)
represents approximately an 82% reduction in the
number of pairs required by the complete pairing
schedule (Condition Z).
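
The deck sizes in Table 1 follow from simple counting: a complete pairing of n stimuli needs n(n - 1)/2 cards, and a schedule giving each stimulus k comparisons needs nk/2. A minimal Python sketch of that arithmetic is given below; the function names are hypothetical and serve only to check the figures in the table.

```python
def complete_pairs(n_stimuli):
    """Number of pairs in a complete pairing of n stimuli."""
    return n_stimuli * (n_stimuli - 1) // 2


def partial_pairs(n_stimuli, pairs_per_stimulus):
    """Number of pairs when each stimulus enters k comparisons."""
    return n_stimuli * pairs_per_stimulus // 2


def reduction(n_stimuli, pairs_per_stimulus):
    """Proportion of the complete schedule that is saved."""
    return 1 - partial_pairs(n_stimuli, pairs_per_stimulus) / complete_pairs(n_stimuli)


if __name__ == "__main__":
    # Pairs per stimulus for Conditions Z, A, B, C, D, and E in Table 1.
    for group, k in zip("ZABCDE", (29, 25, 20, 15, 10, 5)):
        print(group, k, partial_pairs(30, k), f"{reduction(30, k):.1%}")
```

For Condition E the saving is a little under 83%, and for Condition C it is a little under 50%, in line with the approximate figures quoted in the text.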

The Ss used in this study were 36 male and 24
female white undergraduate students who were enrolled
at Oklahoma State University in an elementary
psychology course. These Ss were randomly assigned
to one of six experimental groups, and each
of the six groups was then segregated so that none
of the Ss in one experimental group was aware of
the difference in rating tasks assigned to the other
groups.
Identical instructions were read to the six experimental
groups. Each S was asked to go through a
deck of cards which was placed in front of him and
judge, for each nationality group pair, which member
of the pair was most preferred by the “average
American.” Ss indicated their judgments by marking
a line under each nationality group name selected.
The instructions also indicated that the Ss
would be asked to fill out a short questionnaire after
they had finished the rating task. This questionnaire
was the E Scale (Adorno, Frenkel-Brunswik, Levinson,
& Sanford, 1950) which purportedly gives an
estimate of the extent of ethnocentric attitudes held
by the S. The scale was administered in order to
gain information which would reflect on the adequacy
of the randomization procedures used in assigning
Ss to experimental conditions. A single classification
analysis of variance was run to determine
whether or not there were significant ethnocentric
attitude score differences existing between the groups.
The F value obtained (1.29; 5, 54 df) was not significant
at the .05 level.


Results 


For each experimental group scale values
were determined for each of the 30 stimuli.
This analysis was carried out under the assumptions
required by Case III of the law of
comparative judgment. Hence, using formulas
presented by Burros (1951), estimates of
stimulus discriminal dispersions were computed.
Pearson correlation coefficients were
computed in order to estimate the degree of
association existing between the scale values
determined by the complete pairing and those
obtained from each of the five partial pairings.
Table 2 presents the results of this
analysis. Here it can be seen that the coefficients
are quite high for the first three
partial pairing groups, but they drop for
the two smaller pairing schedules.


Table 2

Correlation Coefficients Estimating Relationships
between Scale Values Obtained from Complete
and Partial Pairing Schedules


Group         r
  A      [illegible]
  B      [illegible]
  C          .93
  D          .82
  E      [illegible]



These coefficients were transformed into
Fisher Z values and one-tailed t tests were
run to determine whether there was a significant
reduction in the size of r as the number
of pairs was reduced. Since one variable
(scale values from Group Z) is held in common
by all correlation coefficients this test is
biased on the conservative side.
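
A comparison of this kind can be sketched in Python as follows. The paper does not give the exact formula used, so the sketch assumes the common large-sample test for two independent correlations, each based on the 30 stimuli, and evaluates the result against the normal curve; the function names are hypothetical.

```python
import math


def fisher_z(r):
    """Fisher's r-to-Z transformation (the inverse hyperbolic tangent of r)."""
    return math.atanh(r)


def compare_correlations(r_high, r_low, n_high, n_low):
    """One-tailed test that r_high exceeds r_low.

    Assumption for this sketch: the two coefficients are treated as if they
    came from independent samples, so the standard error of the difference
    between the Z values is sqrt(1/(n1 - 3) + 1/(n2 - 3)).
    """
    diff = fisher_z(r_high) - fisher_z(r_low)
    se = math.sqrt(1.0 / (n_high - 3) + 1.0 / (n_low - 3))
    stat = diff / se
    p_one_tailed = 0.5 * math.erfc(stat / math.sqrt(2.0))  # upper normal tail
    return stat, p_one_tailed


if __name__ == "__main__":
    # Coefficients for Schedules C and D, each computed over 30 stimuli.
    stat, p = compare_correlations(0.93, 0.82, 30, 30)
    print(f"critical ratio = {stat:.2f}, one-tailed p = {p:.3f}")
```

Under these assumptions the C versus D comparison gives a critical ratio a little above 1.8, which falls below the .05 level on a one-tailed test.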

The results of this analysis indicated that 
there were no significant differences between 
the coefficients obtained from Groups A, B, 
and C. However, a significant difference was 
obtained between the coefficients computed
from Pairing Schedules C and D. This difference
was significant at the .05 level. The
difference between Schedules D and E was
not significant at the chosen significance level.


Discussion 


The magnitude of the coefficients obtained
from relating the partial pairing scale values
with those obtained from the complete pairing
schedule indicates that the method of
paired comparisons can withstand a substantial
reduction in the number of pairs without
seriously altering the scale values obtained.
It will be recalled that scale values derived
from a partial pairing schedule which yielded
approximately a 50% reduction in the number
of pairs still correlated .93 with the scale
values obtained from a complete pairing
schedule. This coefficient, while high, is significantly
lower than those reported by McCormick
and Bachus for a similar reduction
in the number of pairs (.99 and .98). Of
course, the willingness of an investigator to
accept alterations in scale values depends
upon the accuracy required by the purposes
of his research. However, in many applied
situations the time and labor savings associated
with a 50% reduction in the number of
observations would far outweigh this distortion
of scale values. Beyond this point, however,
the results of this study indicate that
further reduction yields significant modifications
in scale value estimates.

It is the contention of this paper that partial
pairing is more than a statistical consideration.
Although the reduction in the number
of observations obviously influences the
stability of the scale values, it also results in
a modification of the over-all rating task required
of the judges. For instance, the initial
task set is conceivably modified when raters
are presented with a smaller number of pairs.
Doubtlessly, the influence of certain fatigue or
boredom factors is modified by reducing the
number of pairs, and it is quite probable that
memory for past judgments plays a more significant
role in partial pairing schedules since
fewer judgments intervene between the successive
appearances of a given stimulus. Therefore,
the investigator should consider the experimental
implications of these modified task
variables before he decides on a substantial
reduction in the number of pairs he employs
in his paired comparisons scaling.


Summary 


The study just reported represents an at- 
tempt to describe the relationship between 
scale values derived from partial and com- 
plete paired comparisons judgments. One 
complete and five partial pairing schedules 
were used, and 60 Ss, who had been randomly 
assigned to one of these six conditions, scaled 
the names of 30 nationality groups. 

The results of the study indicated that par- 
tial pairing scale values rather closely approxi- 
mated those obtained from a complete pairing 
when the number of observations was reduced 
as much as 50% of the complete pairing 
matrix. Beyond this point further reduction 
seemed to yield a more drastic modification 
in the scale values obtained. The rating task 
implications of partial pairing were discussed. 


Received January 26, 1959. 


REFERENCES 


Adorno, T. W., Frenkel-Brunswik, Else, Levinson,
D. J., & Sanford, R. N. The authoritarian
personality. New York: Harper, 1950.

Bogardus, E. S. The measurement of social distance.
In T. M. Newcomb & E. L. Hartley (Eds.),
Readings in social psychology. New York: Holt,
1947. Pp. 503-507.

Burros, R. H. The application of the method of
paired comparisons to the study of reaction potential.
Psychol. Rev., 1951, 58, 60-66.

McCormick, E. J., & Bachus, J. A. Paired comparison
ratings. I. The effect on ratings of reductions
in the number of pairs. J. appl. Psychol.,
1952, 36, 123-127.

Ross, R. T. Optimum orders for the presentation 
of pairs in the method of paired comparisons. J. 
educ. Psychol., 1934, 25, 375-382. 


Journal of Applied Psychology
Vol. 43, No. 6, 1959


AN EXPERIMENTAL INVESTIGATION OF 
SUBLIMINAL PERCEPTION 


JOHN M. CHAMPION 


Georgia State College of Business 


AND WELD W. TURNER


General Motors Institute 


Increasing interest is being directed to the 
possibility that subliminal presentation may 
be useful, or effective, in such activities as 
advertising, attitude modification, or persua- 
sion. Several recent evaluations (McConnell, 
Cutler, & McNeil, 1958; Naylor & Lawshe, 
1958) of available scientific evidence suggest 
that further experimentation is necessary to 
demonstrate the validity of claims that be- 
havior can be influenced by subliminal stimu- 
lation. 

The following experiment was designed and
conducted to determine if a visual stimulus
presented subliminally to a group of Ss would
influence recognition or association responses.
These responses were obtained immediately
after the presentation of the stimulus. If responses
were found to be related to the stimulus
presentation, then “subliminal perception”
could operationally be said to have been demonstrated.
By defining subliminal perception
in this manner, interpretations would not be
extrapolated beyond the data collected.

If the phenomenon of subliminal perception
actually occurs, its effects should appear in
the relatively simple responses of recognition
and association. Investigations of the effect
of subliminal perception on more complex behavior,
such as learning, persuasion, and attitude
change, would be more legitimate after
demonstrating its effect on less complex responses.


Procedure 


Apparatus. A 16 mm. film projector was used to
project a 30-min. film based on a sales administration
textbook entitled The Bettger Story. A slide
projector capable of projecting slides of 3” X 5” was
used to flash slide figures on the screen during the
showing of the film by means of an attached lens
with a shutter. The shutter permitted discrete adjustments
for speed and continuous adjustments for
aperture opening. Two slides were prepared, one to
serve as an experimental stimulus and the other as
a control stimulus. Slide A, the experimental stimulus,
was an original drawing of a spoon of rice with
the words “Wonder Rice” printed below on a black
background. Slide B, the control stimulus,
consisted of four lines placed in a nonsensical manner
on a black background. Slide A was prepared
to be presented to the experimental group and Slide B
to the control group. Slide B was intended to be
nonsensical in nature and used purely for the purpose
of reproducing conditions experienced in the experimental
group as nearly as possible with the exception
of the actual meaningful stimulus.

Subjects. Two groups of Ss enrolled in a sales
and advertising class at Purdue University were used
as a control group and an experimental group. The
administration of the experiment took place during
two consecutive class periods on the same day.

Method. For the purposes of this study subliminal
perception was defined as the presentation of a
stimulus visible under constant exposure at such a
speed as to bring it below the threshold of conscious
awareness. There were three variables to be
coordinated in the presentation of the slide stimulus
in order to achieve subliminal perception. The variables
were exposure time, aperture opening, and slide
construction.

A lens capable of shutter speeds of .01 sec. was
attached to the turret of the slide projector. The
slide projector lens was set at a speed of .01 sec.
and the aperture opening reduced gradually until the
slide was no longer visible when flashed on the screen
while the film was being shown. In order to prevent
the film from masking the stimulus (figures on the
slides) it was necessary to ensure that the stimulus
was visible under constant exposure when superimposed
on the film being presented. It was found
that this requirement could be achieved when slides
with white background and black figures were superimposed
on the film by adjusting exposure time and
aperture opening. However, a white flicker on the
film was detectable when this type of slide was presented
at the relatively slow speed of .01 sec. The
flicker was eliminated by redesigning the slides so
that the background was black and the figures were
white. Thus it was assured that a stimulus projected
at the set aperture opening was being reflected from
the screen with subliminal presentation resulting as
a function of shutter speed.

Table 1


Tabulations of Questionnaire Responses


                                 Monarch   Wonder   Total
Control group responses
    Yes                              7        2        9
    No                              13        3       16
    Total                           20        5       25
Experimental group responses
    Yes                              4        1        5
    No                              10        4       14
    Total                           14        5       19


The slides presented to their appropriate groups
were projected for a duration of .01 sec. at 10-sec.
intervals during the 30-min. film. At the conclusion
of the film a questionnaire was given the Ss for the
purpose of determining the effect of the subliminal
presentation. The questionnaire consisted of a reproduction
of the illustration used in Slide A, that of
the spoon of rice, but this time the words “Wonder
Rice” were omitted. The following was also included
on the questionnaire:


At one time you may have seen the above adver- 
tisement illustration used to promote the sale of 
rice. Check below according to your recognition 
of this advertisement. 

Yes 


No 


Regardless of whether you recognize the illustration
check the brand from the following list which
you believe to be most likely associated with the
illustration.

MONARCH 


WONDER 


The first questionnaire response was designed to
indicate whether the Ss recognized the stimulus figure.
The second response was designed to indicate
whether the Ss associated the hypothetical figure
with the brand name. The brand names used were
selected on the basis of brands believed to be somewhat
common to the Ss but not predominant.

The Ss had no knowledge that an experiment was
being conducted. They were only told that a film
related to their course in sales and advertising was
being presented. In an attempt not to arouse suspicions
regarding the questionnaire, the Ss were told
that the questionnaire was being presented as a corollary
to other requirements of their course work.


Results 


Answers to the questionnaire items were
tabulated for the control and experimental
groups. These tabulations are shown in
Table 1. The chi-square technique was used
to evaluate the differences between the ex- 
perimental and control groups according to 
their recognition of the stimulus figure. 
Table 2 shows the results of the test of the 
hypothesis that the experimental group was 
drawn from a population having proportion- 
ately the same frequency breakdown as the 
control group. The resulting chi-square value 
was not significant. 

The chi-square technique was again used to 
compare the experimental and control groups 
according to their association of brand names 
with the stimulus figures and the results are 
given in Table 3. Not only were the answers 
to the items not significantly different for the 
two groups, but responses for both items were 
in the “wrong” direction. For example, in 
the experimental group significantly more peo- 
ple (at the .05 level) checked the incorrect 
brand name than would have been expected 
to do so by chance. This indicated that a 
pre-experimental bias may have existed and 
illustrates the necessity for using a control 
group rather than merely testing for signifi- 
cant deviations from chance expectancies. 

The chi-square values reported in Tables 2 
and 3 were not corrected for lack of continuity 
in the discrete frequencies as is usually done 
for chi-square tests with only one df. A cor- 
rection for continuity would merely reduce 
the chi-square values which are already too 
small to be significant. 
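
The chi-square values can be checked from the frequencies in Table 1. The Python sketch below is illustrative only and uses hypothetical function names; it takes the control group proportions as the expected breakdown for the 19 experimental Ss, and then repeats the calculation against an even (chance) split for the brand item.

```python
def chi_square(observed, expected):
    """Uncorrected chi-square: sum of (O - E)^2 / E over the cells."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


def expected_from_control(control_counts, n_experimental):
    """Scale the control group proportions to the experimental group size."""
    total = sum(control_counts)
    return [n_experimental * c / total for c in control_counts]


# Frequencies from Table 1.
control_recognition = [16, 9]        # No, Yes
experimental_recognition = [14, 5]   # No, Yes
control_brand = [20, 5]              # Monarch, Wonder
experimental_brand = [14, 5]         # Monarch, Wonder

recognition = chi_square(
    experimental_recognition, expected_from_control(control_recognition, 19)
)
brand = chi_square(experimental_brand, expected_from_control(control_brand, 19))
# Brand choices of the experimental group against an even chance split.
chance = chi_square(experimental_brand, [9.5, 9.5])

if __name__ == "__main__":
    print(f"recognition vs. control: {recognition:.3f}")
    print(f"brand vs. control:       {brand:.3f}")
    print(f"brand vs. chance:        {chance:.3f}")
```

The first two values agree with those reported for the control comparisons (.773 and .474). The third, about 4.26, exceeds 3.84, the .05 critical value for one df, which is the deviation from chance in the "wrong" direction noted in the text.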


Table 2


Control Group vs. Experimental Group on
Recognition of Stimulus Figure


                                 Response
                              No      Yes     Total
Observed (experimental)       14        5       19
Expected (control)         12.16     6.84       19

χ² (uncorrected) = .773; p > .30


Table 3


Control Group vs. Experimental Group on Association
of Brand with Stimulus Figure


                                 Response
                           Monarch   Wonder   Total
Observed (experimental)       14        5       19
Expected (control)          15.2      3.8       19

χ² (uncorrected) = .474; p > .30


Conclusions


The results of this study indicate that subliminal
presentation had no effect on the responses
of the Ss in recognizing the stimulus
figure or in associating the brand name with
the stimulus figure. It was felt that subliminal
perception should be demonstrated at this
level of response before investigating the effects
of subliminal presentation on more complex
behavior, such as buying specific products
or influencing public opinion.

Although responses to the questionnaire 
deviated significantly from a chance distribu- 
tion, the responses were in the opposite direc- 
tion from what would be expected if sublimi- 
nal perception existed, and the experimental 
group responses did not differ significantly 
from the control group responses. 

There may be subliminal “thresholds” of
perception, similar to the conventional threshold
or limen for the awareness of sensations.
Failures to produce evidence of subliminal
perceptions could be attributed to stimulus
presentations at levels below the subliminal
“threshold.” Limitations in the flexibility of
available equipment prevented the present authors
from searching for thresholds of subliminal
impressions.

If subliminal perception occurred, it did not
affect questionnaire responses. Subliminal perception
could conceivably be demonstrated
with a method similar to that used here by
presenting the stimulus more frequently or
by simplifying the design of the stimulus.
However, in view of the results of this study, 
it appears that the burden of proof is placed 
on those who insist that subliminal perception 
is capable of influencing behavior. 


Received February 9, 1959. 


REFERENCES 

McConnell, J. V., Cutler, R. L., & McNeil, E. B.
Subliminal stimulation: An overview. Amer. Psychologist,
1958, 13, 229-242.

Naylor, J. C., & Lawshe, C. H. An analytical review
of the experimental basis of subception. J.
Psychol., 1958, 46, 75-96.




Journal of Applied Psychology
Vol. 43, No. 6, 1959 


THE GSR IN THE DETECTION OF GUILT¹


DAVID T. LYKKEN 


University of Minnesota 


Use of the lie detector depends on the as- 
sumption that there is a distinctive pattern 
of physiological response which accompanies 
lying and which can be distinguished from 
that which accompanies truth telling. Most 
modern lie detector operators expect lying to 
produce a greater amplitude of physiological 
response, although others have asserted that
certain qualitative differences are character- 
istic (e.g., Marston, 1938, p. 52; Summers, 
1939). Claims of high validities for these 
methods do not find support in properly con- 
ducted empirical study. The most extensive 
research thus far reported (Ellson, Davis, 
Saltzman, & Burke, 1952), which employed 
a total of 13 response variables and careful 
multivariate statistical analysis, achieved only 
73% correct classification, against a chance 
expectancy of 25%. 

Use of physiological measurements to detect
not lying, but the presence of “guilty
knowledge,” requires only the more reasonable
assumption that a guilty person will
show some involuntary physiological response
(e.g., GSR) to stimuli related to remembered
details of his crime. If the crime is such that
the investigator can discover a number of
factual details with which only the guilty
person should be familiar, then the guilty
knowledge method can be used. The guilty
knowledge items are interspersed with other
similar but irrelevant items in a stimulus list.
The S is told that E is going to mention a
number of items and that, if he is guilty, he
will recognize some of these as being related
to the crime in question. The items may be
stated in question form, in which case the S
may or may not be required to answer.

A guilty S, knowing which items are relevant
and which are not, would be expected to
respond differently to the relevant than to the
irrelevant items. Usually, he would be expected
to give larger responses to the relevant
items, although it should be pointed out that
any consistent difference in the responses to
the two classes of stimuli is evidence of guilt.
Thus, an S who manages by self-stimulation
to produce large GSRs to the irrelevant items
is betrayed by the fact that his responses to
the relevant items are consistently smaller.


¹ Richard Rose, George Skaff, and Joe Ylitalo conducted
this experiment.


Method 


Ss used in this experiment were 49 male college 
students who were assigned at random to four 
groups. Those in Group 1 (13 Ss) were required to 
enact two mock crimes in sequence, a “murder” and 
a “theft.” For the Murder enactment, S was taken 
to the second floor of the building and required to 
knock on the door of one of the offices. The door 
was opened by an assistant who, after some prelimi- 
nary conversation, invited S to play a hand of poker, 
which was thereupon dealt out, the assistant getting 
the better hand. Remarking that S now owed him 
a hundred dollars, the assistant then walked over to 
stand looking out the window. Taking a weapon 
from his pocket, S went through the motions of 
killing the assistant, hid the weapon in a drawer of 
the desk, and left the office. 

In the Theft enactment, S had to idle near the 
doorway of a different office until the occupant, a 
woman, left it to go into the washroom. S then 
hurriedly entered and riffled through the desk calen- 
dar until he found a page on which his own name 
had been entered. He erased the name and then 
searched through the desk until he found the article 
(e.g., a watch) which he had been instructed to 
“steal.” Leaving the office, he hid the stolen prop- 
erty in a locker in the hallway. 

As already mentioned, Ss in Group 1 enacted both 
of these mock crimes, in random sequence. Those 
in Group 2 enacted only the Murder, those in Group 
3 only the Theft, and those in Group 4 were exposed 
to neither of the crimes. The next step was for S 
to be turned over to another E for interrogation. E 
was not informed to which group S belonged. S 
was seated in the interrogation room, GSR elec- 
trodes attached to his dominant hand, shocking elec- 
trodes to his other hand, a blindfold put over his 
eyes and a pair of headphones adjusted to his ears. 
E was located with the apparatus in an adjoining 
room and spoke to S via a microphone. 

Each S was told that he was to be questioned in 
relation to two crimes. He was instructed to listen 
to each question but not to reply to any of them. 
He was told that each question consisted of several 
parts and that if, at the end of any question, E felt 
that the physiological response (GSR) indicated guilt, 
then S would be given an electric shock. The shock 


385 


386 


nstrated, most Ss finding it to be 
ie Becta) (the shock was the discharge of a 
demi. capacitor, charged to 300 v., through 8 oe 
diameter electrodes on the fingerprint aroa o = 
first and third fingers). In fact, ammagpechive or t e 
particular S’s response, the shock was always = 
following the completion of the GSR to the last par 
of Questions 2, 3, and 5 of the Murder list and a 
tions 1, 3, and 4 of the Theft list. {the pilrpose of 
the shock was merely to increase S's general anxiety 
level and increase to some extent his motivation not 
to give a guilty record and thus to create a situation 
resembling a little more that of real criminal interro- 
i interrogation lists were standard and each 
consisted of six multiple-choice-type questions. E 
first read the question and then read each of the 
short alternative answers, allowing Sufficient time 
after each for GSR activity to dissipate. One al- 
ternative for each question was relevant for a given 
S. Two of the six Murder questions were as follows: 


(1) If you are the murderer, you will know 
that there was an unusual object present in the 
murder room. Was it (a) a record (b) an easel 
(c) a candy box (d) a chess set?; (2) The mur- 
derer hid the weapon in one of the drawers of a 
desk. Which drawer was it? Was it the (a) 
upper left (b) lower right (c) upper right (d) 
middle (e) lower left? 


Two of the six Theft questions were as follows: 


(1) If you are the thief, you will know where
the desk was located in the office in which the
theft occurred. Was it (a) on the left (b) in
front (c) on the right?; (2) The thief hid what
he had stolen. Where did he hide it? Was it
(a) in the men's room (b) on the coat rack (c)
in the office (d) on the window sill (e) in the
locker?


The number of alternatives averaged 4.67 in the
Murder list and 5.0 in the Theft list. Questions 2,
3, and 6 in the Murder list and 2, 3, 4, and 6 in the
Theft list were "double-blind," that is, the relevant
or guilty alternative was varied at random from S
to S so that E did not know which was which.
Questions were always given in the same order
within a list but whether the Murder or Theft list
was given first was determined at random.

Scoring was simple, a priori, and objective. An 
S's GSRs to the several alternatives in a given ques- 
tion were ranked in order of amplitude. If his larg- 
est response was to the relevant alternative, he was 
given a score of 2 on that question. If his second 
largest response was to the relevant alternative, he 
was given a score of 1. Thus, a perfect Innocent 
score was 0 and a perfect Guilty score was 12, for 
both lists. 
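The scoring rule just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the example amplitudes, and the handling of ties (the alternative read first wins) are hypothetical and do not come from the paper; the cutoff of 6 is the one applied in the Results.

```python
def score_question(amplitudes, relevant_index):
    """Return 2 if the relevant alternative drew the largest GSR,
    1 if it drew the second largest, and 0 otherwise."""
    # Order the alternatives from largest to smallest amplitude.
    # sorted() is stable, so ties go to the alternative read first
    # (an assumption; the paper does not say how ties were handled).
    order = sorted(range(len(amplitudes)),
                   key=lambda i: amplitudes[i], reverse=True)
    rank = order.index(relevant_index)  # 0 means largest response
    return {0: 2, 1: 1}.get(rank, 0)


def score_list(questions):
    """Sum the question scores for one six-question list (range 0 to 12)."""
    return sum(score_question(amps, rel) for amps, rel in questions)


def classify(total, cutoff=6):
    """Scores above the cutoff count as guilty, the rest as innocent."""
    return "guilty" if total > cutoff else "innocent"


# Hypothetical record: (GSR amplitudes per alternative, index of the relevant one)
example_list = [
    ([1.0, 3.2, 0.8, 1.1], 1),       # largest response to relevant item: 2
    ([2.0, 0.5, 1.9, 0.7, 0.6], 2),  # second largest: 1
    ([0.4, 0.9, 1.5], 0),            # neither: 0
    ([2.2, 1.0, 0.3, 0.8], 0),       # 2
    ([0.6, 0.7, 2.5, 0.2, 0.1], 2),  # 2
    ([1.1, 1.8, 0.9, 0.5], 1),       # 2
]
total = score_list(example_list)
print(total, classify(total))
```

With these made-up amplitudes the list total is 9, which falls above the cutoff.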


Results 


If all scores of 6 or less are classified "innocent"
and all those over six "guilty," then
four Ss from Group 1 and one from Group [illegible]
would be misclassified as to group, a total of
5 misses out of 49, or 89.8% hits. Considering
the two crimes separately, there were 50
interrogations of Guilty Ss (the 24 Ss from
Groups 2 and 3 plus the 13 Ss from Group 1
who were Guilty of both crimes), and 48
interrogations of Innocent Ss (the 24 Ss from
Groups 2 and 3 plus the 12 Ss from Group 4
who were Innocent of both crimes). Forty-four
of the 50 interrogations of Guilty Ss resulted
in scores of 7 or higher, all of the 48
interrogations of Innocent Ss gave scores of 6
or lower, a total of 93.9% correct classification.
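The percentages in this paragraph follow from the group sizes. The short check below is a sketch only; the variable names are illustrative, and the count of Innocent interrogations is derived here from the group sizes (24 + 2 x 12) rather than read directly from the text.

```python
# Group sizes as given in the text.
group_1 = 13          # enacted both crimes
groups_2_and_3 = 24   # enacted exactly one crime
group_4 = 12          # enacted neither crime
total_subjects = group_1 + groups_2_and_3 + group_4

# Classification by group: 5 misses out of 49.
group_hit_rate = 100 * (total_subjects - 5) / total_subjects

# Classification by interrogation, taking the two crimes separately.
guilty_interrogations = groups_2_and_3 + 2 * group_1     # one each plus two each
innocent_interrogations = groups_2_and_3 + 2 * group_4
correct = 44 + innocent_interrogations                   # 44 guilty hits, all innocent correct
interrogation_hit_rate = 100 * correct / (guilty_interrogations + innocent_interrogations)

print(total_subjects, guilty_interrogations, innocent_interrogations)
print(round(group_hit_rate, 1), round(interrogation_hit_rate, 1))
```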


Discussion 


It should be emphasized that these results
by no means represent the upper limit of validity
that could be achieved with the [illegible]
and objective guilty knowledge technique.
On the other hand, one must consider whether
results from such a laboratory study can
safely be extrapolated to the real life criminal
interrogation situation. Some of the points
that might be raised in this connection are
discussed below.

1. All Ss in the real life situation would be
more emotionally involved in the [illegible].
The use of electric shock in the experiment
was intended to make the situations somewhat
more comparable in this respect, but
certainly an important difference still
remained. However, because of the nature of
the guilty knowledge method, an increase in
general emotional reactivity in either an
innocent or a guilty S does not in itself affect
the validity of the test. As long as S is able
to comprehend the situation and to respond
more intensely to a question having some
special significance for him than he does to most
of the questions, the method is not
compromised in its ability to differentiate
innocence from guilt.

2. The Ss in this experiment were not
particularly sophisticated concerning the method
being used and were not strongly motivated,
if guilty, to try to defeat the test. There is
no way in which an S, once he has perceived
a stimulus, can inhibit what would be his
normal GSR to that stimulus. However, it
is possible to try to defeat the guilty
knowledge type of test by producing intentional
or artificial responses to the nonsignificant
stimuli so as to reduce the relative size of
the involuntary guilty response and so
confuse the record. Artificial GSRs can be
produced in various ways by a sophisticated S.
However, because the GSR is peculiar in that
it does not produce any proprioceptive
stimulation, it is not possible for a subject to know
whether his attempt to produce a deliberate
response has been successful and it is
certainly impossible for him to deliberately
produce responses of controlled sizes. Still, it
remains to be experimentally determined to
what extent a sophisticated, motivated S can
confuse in this way a guilty knowledge
record. A second experiment is in progress
which is concerned with this problem.

3. The Ss in this experiment were college
students and hardly representative of the
average run of criminal suspects; perhaps a
proportion of the latter would not respond
"normally" in such a test. Again, a final answer
to the question suggested can only be
provided by an appropriate experiment. The
literature of lie detection does include
references to the problem of the nonreacting S.
However, in contrast to lie detection
procedures, the guilty knowledge method, which
uses each S as his own control, does not
require that the responses of the guilty S be
comparable in any way to those of the
innocent, but merely that the guilty S respond
differently to some of the items than he does
to others, something which the innocent S
cannot consistently do. It is interesting to
note in this connection that one of the Ss in
Group 1 was a Hungarian expatriate who,
while engaged in underground activities
several years earlier, had been arrested and
subjected to intensive interrogation by Russian
secret police. Although he had been
successful then in maintaining his forged identity
and in convincing the MVD that he was
ignorant of any underground activities, he was
easily identified by the guilty knowledge test
as being guilty of both murder and theft!

4. The Ss in this experiment spent only a
few minutes in the mock crime situations and
therefore had little opportunity to note the
details of the situation which was used for
the guilty knowledge test. It was no surprise
to find that many Ss who were guilty of the
murder, for example, reported after the
interrogation that they had not noticed the map
on the wall of the Murder room, or the chess
set on the bookcase, etc. Real life crime
situations would obviously vary enormously
among themselves in this respect. A suspect
who is accused of having robbed a series of
liquor stores can safely be assumed to know,
if he is guilty, a number of things which an
innocent person would not, such as the
locations and appearances of the stores, the
amounts taken, the appearance of the various
victims, certain striking facts about what was
said or done during the robberies, and so on.
On the other hand, the question at issue might
be which one of a group of armed thieves fired
a fatal shot. In such a case, the guilty
individual would not be expected to possess any
guilty knowledge not shared by his
confederates and/or the other suspects, and the
present method would not be of any use.
(Obviously, each suspect might be expected to give
a larger response to the name of the guilty
one than to the other names, his own
excluded. Such consistency would, if found,
rather clearly identify the guilty individual.
However, such a method cannot have the
certainty of the guilty knowledge technique.)

It seems reasonable to suppose that many 
real life crimes would lend themselves to the 
use of the guilty knowledge method, keeping 
in mind that trivial and seemingly irrelevant 
details are as useful as interrogation stimuli 
as are the more obvious facts, such as the 
weapon used, the article stolen, etc., which 
might be passed on to innocent suspects by 
the newspapers or the arresting officers and 
thereby made useless for this purpose. It 
also seems reasonable that, in such cases, the 
guilty person might be expected to have a 
wider range of guilty knowledge than was in- 
duced in the subjects of the present experi- 
ment. 


5. Since only about 15 min. of interrogation
time and only six questions were used in
the interrogation for each of the mock crimes,
it can be assumed that a higher validity could
easily be achieved by a longer interrogation,
using questions more than once and using a
greater variety of questions. With only six
questions and the simple scoring system used
here, about one S in 50 might be classified
guilty though actually innocent, due to chance
fluctuations. The probability of such
false-positive misclassification decreases rapidly as
the number of questions is increased. Thus,
with only 10 questions, having five
alternatives each, less than 3.28% of innocent Ss
will show guilty responses on more than four
questions and less than 0.64% on more than
five. (These figures assume that the
questions are well enough constructed so that the
probability of an innocent S reacting most
strongly to the relevant alternative is about
equal to that for the mean of the other
alternatives.)
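The two percentages quoted for a 10-question test agree with a simple binomial model, sketched below. The model is an assumption made for illustration: each question is treated as an independent trial on which an innocent S gives his largest response to the relevant alternative with probability 1/5. The function name is hypothetical.

```python
from math import comb


def tail_probability(n, p, k):
    """Return P(X > k) for X distributed Binomial(n, p)."""
    return sum(comb(n, j) * p**j * (1 - p)**(n - j)
               for j in range(k + 1, n + 1))


n_questions = 10
p_chance = 1 / 5  # five alternatives, one of them relevant

more_than_four = tail_probability(n_questions, p_chance, 4)
more_than_five = tail_probability(n_questions, p_chance, 5)

print(f"{100 * more_than_four:.2f}%")  # 3.28%
print(f"{100 * more_than_five:.2f}%")  # 0.64%
```

Under this model the tail shrinks quickly as questions are added, which is the point the paragraph makes about longer interrogations.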

6. The scoring system used in this
experiment was simple and did not involve any
attempt to defend against the possibility of
S making deliberate responses in order to
defeat the test. The guilty knowledge method
does not require one to assume that the guilty
S will tend to give larger reactions to the
relevant items, although the present scoring
system did require this result. All that need
be assumed is that the guilty S will react
differently to the relevant items, as a group,
than he does to the irrelevant alternatives.
The only way in which an S can behave
consistently differently with respect to the set
of relevant alternatives than he does to the
others is by having some way of
distinguishing these alternatives from the rest, i.e., by
having the guilty knowledge which declares
him to be guilty in fact. In a situation where
active attempts by a sophisticated S to
defeat the test are to be expected, a more
subtle scoring system [illegible]
should yield a higher validity.




Summary 


Forty-nine male college students, after
random assortment into four groups, were
required to enact one, both, or neither of two
mock crimes. All were then given a guilty
knowledge test, employing the GSR, which
used six standard questions relating to each
of the two crimes. A simple, objective, and
a priori scoring system was used to determine
guilt. Forty-four or 89.8% of the Ss were
assigned to their correct group, compared to a
chance expectancy of 25%. Considering the
crimes separately, all Ss innocent of a crime
were correctly classified, while 44 of 50
interrogations of Guilty Ss gave guilty
classifications, a total of 93.9% correct classification
against a chance expectancy of 50%.

Lie detection, requiring unreasonable
assumptions about the consistency of
physiological response patterns, has not been shown
by acceptable research to have the high
validity claimed for it and which is necessary
for its useful application. Detection of guilty
knowledge, while less widely applicable, is a
more reasonable, objective, and generally
defensible technique and is demonstrably
capable of very high validity in those situations
where it can be used.


REFERENCES

Ellson, D. G., Davis, R. C., Saltzman, I. J., & Burke,
C. J. A report of research on detection of
deception. Contract N6onr-18011 with Office of Naval
Research, 1952.

Marston, W. M. The lie detector test. New York:
Smith, 1938.

Summers, W. G. Science can get the confession.
Fordham Law Rev., 5, 335-354.


(Received February 17, 1959) 


Journal of Applied Psychology
Vol. 43, No. 6, 1959


DURATION OF MOVEMENTS IN A DIAL SETTING 
TASK AS A FUNCTION OF THE PRECISION 
OF MANIPULATION * 


J. RICHARD SIMON AND BETTY PEARL SIMON


State University of Iowa 


This study deals with the interrelation of 
the component movements in a dial setting 
task. The variable manipulated is the pre- 
cision or accuracy required to set the dials. 

Psychologists and engineers have long rec- 
ognized that an important factor in determin- 
ing the duration of a task is the precision 
which the task demands of the operator. 
Nevertheless, there has been little systematic 
research on the effects of precision as a vari- 
able on movement times. Predetermined mo- 
tion time systems (Maynard, 1956, Sec. 4) 
consider the precision, or manual control, or
difficulty of a movement component in set- 
ting the time standard for that component. 
Little is known, however, of the effects which 
variation in the precision requirements of one 
part of a movement have on the durations of 
other parts of the same movement. 

In the present study, the precision required 
to adjust each of two dials on a simplified 
control panel was systematically varied while 
all other characteristics of the control move- 
ment were held constant. In this way, the 
effects of precision of manipulation on the 
durations of other parts of the motion cycle 
were investigated. Though the results of this
research may have practical applications in 
the setting of time standards, this was not its 
primary objective. This study is one of a
series (Davis, Wehrkamp, & Smith, 1951;
Harris & Smith, 1954; Rubin, Von Trebra, &
Smith, 1952; Simon & Smader, 1955; Simon,
1956; Wehrkamp & Smith, 1952) aimed at
gaining a more complete understanding of
human manual movements through the
systematic identification and delineation of the
variables which affect movement duration.

*Data for this paper were collected during 1955-
56 while the senior author was a Fulbright research
scholar at the University of Cambridge, England.
The authors are indebted to Alan T. Welford for
his cooperation and assistance and to Karl U. Smith
for the loan of the recording apparatus.


Method 


Apparatus. Figure 1 pictures the main elements of 
the dial setting task. S is shown seated in front of 
the control panel adjusting one of the two dials. His 
task was simply to adjust the dials one after the 
other as rapidly as possible. An electronic motion 
analyzer, described previously (Simon & Smader, 
1955) was used to record separately and automati- 
cally the durations of the parts of the dial setting 
task. By grasping and turning the right dial, S com- 
pletes a circuit and current is supplied to a precision 
timer which records the duration of that manipula- 
tion to .01 sec. When S releases the dial and moves 
toward the left dial, the first clock stops and a second 
clock begins which records the duration of the right 
to left travel movement. S’s contact with the left 
dial stops the travel clock and starts a third clock 
which records the time taken to adjust the left dial. 
A fourth clock similarly records the duration of the 
left to right travel movement. The duration of each 
successive manipulation or travel movement is ac- 
cumulated on one of these four timers. 
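The logic of the four accumulating clocks can be sketched as a small event-driven routine. This is an illustrative model only, not a description of the original circuitry: the event names, the function name, and the example timings are hypothetical.

```python
def accumulate_durations(events):
    """Accumulate time on four clocks from a list of (time_sec, event) pairs.

    Each contact or release event stops whichever clock is running
    and starts the clock for the next part of the motion cycle.
    """
    clocks = {
        "right_manipulation": 0.0,
        "right_to_left_travel": 0.0,
        "left_manipulation": 0.0,
        "left_to_right_travel": 0.0,
    }
    # Clock that starts running after each event (assumed event names).
    starts = {
        "grasp_right": "right_manipulation",
        "release_right": "right_to_left_travel",
        "grasp_left": "left_manipulation",
        "release_left": "left_to_right_travel",
    }
    running, last_time = None, None
    for time_sec, event in events:
        if running is not None:
            clocks[running] += time_sec - last_time
        running, last_time = starts[event], time_sec
    return clocks


# Hypothetical event stream covering one and a half cycles.
example_events = [
    (0.0, "grasp_right"), (0.7, "release_right"),
    (1.1, "grasp_left"), (1.9, "release_left"),
    (2.3, "grasp_right"), (3.0, "release_right"),
]
totals = accumulate_durations(example_events)
print({name: round(value, 2) for name, value in totals.items()})
```

Because each clock only ever adds to its own total, the four readings at the end of a trial are the summed durations of the 12 repetitions of each part.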

The two types of dials used in the experiment are
also pictured in Fig. 1. Two dials of Type A and
two dials of Type B were used. The diameter of
each dial measured 4 in. The diameter of the
knurled knob which S grasped to manipulate the
dial was 1[fraction illegible] in. and extended
[fraction illegible] in. from the dial face.
Considering the 12 o'clock position as 0°, the four
white marks on each dial face were placed at 0°, 70°,
170°, and 285°. The white marks on the face of
Dial Type A were narrow (2° of arc). The marks
on the face of Dial Type B were wide (20° of arc).
Aligning a mark on the face of Dial Type A with
the target mark (2°) above the dial required S to
make a fine adjustment, while aligning a mark on
the face of Dial Type B with the target mark (2°)
required only a gross adjustment.

Fig. 1. Sketch of dial setting task, showing Dial Type A
(fine adjustment), Dial Type B (gross adjustment), and the
signal lights.

All dials and mountings were precision machined 
and aside from the difference between Type A and 
Type B in the width of the marks on their faces, 
the four dials were alike. The dials rotated easily, 
but there was sufficient friction so that S’s release of 
the dial after making a setting would not throw the 
setting out of line. 

The S's task was to set the dials alternately, first
the right dial and then the left dial, etc., until each
dial had been set 12 times. S used his right hand
only. To set a dial he had to rotate it approximately
1/4 turn in order to align the next mark on
the dial face with the target mark at the 12 o'clock
position above the dial. During a trial, each dial,
whether Type A or Type B, was rotated through
three complete turns. Thus, the precision of the
manipulative part of the task was varied without
altering the extent of the manipulation. To begin a
trial, S briefly tapped the left dial and then moved
to adjust the right dial. There were, then, 12 left
to right travel movements and 12 right to left travel
movements during each trial.

CONDITION   LEFT DIAL   TRAVEL (left to right, right to left)   RIGHT DIAL
I           M1          T1, T2                                  M2
II          M3          T3, T4                                  M4
III         M5          T5, T6                                  M6
IV          M7          T7, T8                                  M8

Fig. 2. The four experimental conditions used to
investigate the effects of precision on movement
durations.
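The extent of rotation per setting follows from the positions of the four marks given in the apparatus description. The sketch below is only a check on that arithmetic; the variable names are illustrative, and the direction in which S turned the dial is not assumed (the set of gaps is the same either way).

```python
# Mark positions on each dial face, in degrees from the 12 o'clock position.
marks = [0, 70, 170, 285]

# Rotation needed to bring each successive mark to the target position.
gaps = [(marks[(i + 1) % len(marks)] - marks[i]) % 360
        for i in range(len(marks))]

settings_per_trial = 12
cycles = settings_per_trial // len(marks)     # passes through all four marks
total_rotation = cycles * sum(gaps)           # degrees turned per dial per trial
mean_rotation = sum(gaps) / len(gaps)

print(gaps)                    # [70, 100, 115, 75]
print(mean_rotation)           # 90.0, i.e. a quarter turn on average
print(total_rotation / 360)    # 3.0 complete turns
```

The unequal gaps are what the text relies on later when it says that kinesthetic cues were minimized, since no two successive settings call for the same rotation.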

An essential portion of the apparatus was the light 
above each dial which signaled a correct setting. A 
circuit was arranged so that when a mark on the 
dial face was correctly aligned with the target mark 
above the dial, the signal light would go on. Align- 
ing one dial automatically caused the light above the 
other dial to go off. The system of signal lights per- 
mitted E to maintain a constant standard of accu- 
racy for all Ss since Ss were instructed to “work as 
fast as you can” but “always be sure the light is on 
before moving to set the next dial.” 

The dial setting task was designed to involve visual 
cues primarily. There were no perceptible tactual 
cues to aid the operator. The relays in the signal 
light circuit were housed in a soundproof box to 
eliminate auditory cues. The four marks on the dials 
were arranged so that, for each setting, a different 
amount of rotation was required to bring the dial 
into alignment. Since S did not replicate the same 
rotation each time, kinesthetic cues were minimized. 

Procedure. The experimental conditions consisted 
of the four combinations of fine and gross settings 
pictured in Fig. 2. Condition I involved a gross 
manipulation of both dials. Condition II involved 
a fine manipulation of the left dial and a gross ma- 
nipulation of the right dial. Condition III involved 
a gross manipulation of the left dial and a fine ma- 
nipulation of the right dial. Condition IV involved 
a fine manipulation of both dials, 

Twenty-four right-handed naval enlisted men (mean
age [illegible], SD 3.9) served as Ss. Each S reported on
four days during a five-day period. On the first
day each S was assigned to one of the 24 possible
permutations of the four experimental conditions, and
continued to perform in that sequence during the
following three sessions. All Ss, then, performed on
all experimental conditions during each of the four
sessions. Each session consisted of 12 trials, three on
each of the four experimental conditions. S
completed the three trials on one condition before
proceeding on the next condition in his sequence.
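The counterbalancing scheme can be illustrated by enumerating the orders of the four conditions. The sketch assumes one S per order, which is consistent with 24 Ss and 24 possible orders but is not stated in so many words; the variable names are hypothetical.

```python
from collections import Counter
from itertools import permutations

conditions = ["I", "II", "III", "IV"]

# Every possible order in which the four conditions can be run.
orders = list(permutations(conditions))

# With one S per order (an assumption), each condition falls in each
# ordinal position equally often, so order effects are balanced.
position_counts = Counter(
    (position, condition)
    for order in orders
    for position, condition in enumerate(order, start=1)
)

trials_per_session = 3 * len(conditions)

print(len(orders))                    # 24
print(set(position_counts.values()))  # {6}
print(trials_per_session)             # 12
```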


Predictions 

On the basis of previous experimental work
(Simon, 1956) which indicated that:
(a) increasing the perceptual loading of a
movement increases its duration and (b)
increasing the perceptual loading of one part of
a task increases the durations of certain other
parts of the task; and assuming that
increasing the precision requirements of a movement
is a method of increasing its perceptual
loading, several specific predictions were made.

1. Durations of travel movements will be
influenced by the precision of the
manipulation which precedes the travel. With travel
direction and precision of the subsequent
manipulation held constant, travel movements
will be slower when preceded by a fine
manipulation than when preceded by a gross
manipulation. In terms of the code notations
in Fig. 2, the following relations should be
observed: T3 > T1, T7 > T5, T6 > T2, and
T8 > T4.

2. Durations of travel movements will be
influenced by the precision of the
manipulation which follows the travel. With travel
direction and precision of the preceding
manipulation held constant, travel movements
toward a fine manipulation will be slower than
travel movements toward a gross
manipulation. Again in terms of Fig. 2, the following
relations should be observed: T5 > T1, T7 >
T3, T4 > T2, and T8 > T6.

3. Since precision of manipulations either
preceding or following a travel movement will
affect the duration of the travel, movements
between two fine manipulations will be slower
than movements between two gross
manipulations. In terms of Fig. 2, T7 > T1 and
T8 > T2.


Table 1

Mean Duration (Seconds) of Parts of Dial Setting Task under Four Experimental Conditions
for Four Practice Sessions

Component                  Experimental  Condition            Sessions
Parts of Task              Condition     Code          1      2      3      4
Left to right travel       I             T1         4.55   4.47   4.44   4.52
                           II            T3         4.97   4.80   4.81   4.83
                           III           T5         5.18   4.93   4.81   4.85
                           IV            T7         5.46   5.09   5.11   5.03
Right to left travel       I             T2         4.47   4.47   4.42   4.53
                           II            T4         5.08   4.77   4.69   4.80
                           III           T6         5.09   5.01   4.95   5.02
                           IV            T8         5.44   5.16   5.01   5.05
Left dial manipulation     I             M1         8.89   8.60   7.89   8.09
                           II            M3        17.36  15.88  15.68  15.37
                           III           M5        10.37   9.24   8.33   8.26
                           IV            M7        17.87  16.80  16.11  16.01
Right dial manipulation    I             M2         8.88   8.53   7.91   8.13
                           II            M4        10.46   9.04   8.43   8.08
                           III           M6        19.69  18.31  17.40  17.16
                           IV            M8        20.36  18.16  17.44  17.82


Results


The performance of 24 Ss during their 
fourth session was analyzed to determine the 
effects of precision as a variable on the dura- 
tions of the four parts of the dial setting task. 
For each S, a median time was determined 
for each part of the task under each experi- 
mental condition. 

Four separate analyses of variance were 
performed,² one for each part of the task;
viz., left dial manipulation, right dial ma- 
nipulation, left to right travel, and right to 
left travel. These analyses indicated that the 
experimental conditions produced significant 
(p < .01) variations in the durations of all 
four parts of the task. 


2 Summaries of the analyses of variance have been 
deposited with the American Documentation Insti- 
tute. Order Document No. 6024 from ADI Aux- 
iliary Publications Project, Photoduplication Service, 
Library of Congress, Washington 25, D. C., remit- 
ting in advance $1.25 for microfilm or $1.25 for 
photocopies. Make checks payable to Chief, Photo- 
duplication Service, Library of Congress. 


Table 2

Predicted Relationships between Movement Durations
and Tests of Observed Differences

                        Observed Difference
                        Between Means
t Test    Prediction    (Seconds)              t
1         T3 > T1       .31                    6.22*
2         T7 > T5       .18                    2.98*a
3         T6 > T2       .49                    5.77*
4         T8 > T4       .25                    4.04*
5         T5 > T1       .33                    5.38*
6         T7 > T3       .20                    4.36*
7         T4 > T2       .27                    4.23*
8         T8 > T6       .03                     .64
9         T7 > T1       .51                    8.50*
10        T8 > T2       .52                    6.52*

* p < .01.
a If the error rate per experiment (see Ryan) is set at .01,
this t is not significant. The rest of the starred values are
significant in terms of error rate per experiment as well
as error rate per comparison.


Duration of Travel Movements 


Table 1 presents the mean durations of the
parts of the task under each experimental
condition over the four practice sessions. Using
scores from the fourth session, 10 t tests
between correlated means were computed in
order to test the specific predictions regarding
the effects of varying precision on travel
movement durations.

Table 2 summarizes the results of these
tests. The first prediction stated that with
travel direction and precision of the
subsequent manipulation held constant, travel
movements will be slower when preceded by
a fine manipulation than when preceded by a
gross manipulation. This prediction was
verified by t tests No. 1-4.

The second prediction stated that with
travel direction and precision of the
preceding manipulation held constant, travel
movements toward a fine manipulation will be
slower than travel movements toward a gross
manipulation. This prediction was verified
by t tests No. 5, 6, and 7 but was not
supported by 8, which was the right to left
movement preceded by a fine manipulation.

The third prediction stated that
movements between two fine manipulations will be
slower than movements between two gross
manipulations. This prediction was verified
by t tests No. 9 and 10.
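The observed differences tested here can be recomputed from the Session 4 travel means in Table 1. The sketch below is a reading of the tables, not the authors' computation: the pairing of codes in each comparison is inferred from the definitions of the four conditions, and the variable names are illustrative.

```python
# Session 4 mean travel durations (seconds), as read from Table 1.
session4 = {
    "T1": 4.52, "T2": 4.53, "T3": 4.83, "T4": 4.80,
    "T5": 4.85, "T6": 5.02, "T7": 5.03, "T8": 5.05,
}

# (slower, faster) code pairs for the ten comparisons, in test order.
comparisons = [
    ("T3", "T1"), ("T7", "T5"), ("T6", "T2"), ("T8", "T4"),  # prediction 1
    ("T5", "T1"), ("T7", "T3"), ("T4", "T2"), ("T8", "T6"),  # prediction 2
    ("T7", "T1"), ("T8", "T2"),                              # prediction 3
]

differences = [round(session4[slow] - session4[fast], 2)
               for slow, fast in comparisons]

for number, ((slow, fast), diff) in enumerate(zip(comparisons, differences), 1):
    print(f"{number:2d}  {slow} > {fast}  {diff:.2f}")
```

Every difference is in the predicted direction; the eighth (T8 vs. T6) is the only one near zero, which matches the one comparison reported as not significant.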


Duration of Manipulative Movements 


Table 1 shows that setting Dial Type A
(fine manipulation) required about twice as
much time as setting Dial Type B (gross
manipulation). The t tests of M4 vs. M2 and
M5 vs. M1 on Session 4 indicated that the
duration of the gross manipulation was not
affected by being paired with a fine
manipulation rather than another gross manipulation.
During the earlier sessions, however, the gross
manipulation was significantly slowed when
paired with a fine manipulation. A
comparison of M3 and M7 suggests that a fine
adjustment is faster (p < .05) when paired with a
gross adjustment than when paired with a
fine adjustment. This relationship, however,
was not substantiated by the M8 vs. M6
comparison.


Changes with Practice 


The major effects of the experimental
conditions on the durations of the movement
components noted during Session 4 appeared
consistently over the first three practice
sessions as well. The average decrease in the
duration of the manipulative portions of the
task from Session 1 to Session 4 was 13%,
while the duration of the travel movements
decreased only 4% over the same period.
There appeared to be no tendency for the
precise manipulations to improve more with
practice than the gross manipulations. On
the contrary, the most marked practice effect
was observed for the gross manipulation
under the condition where the other terminal
manipulation involved a fine adjustment.
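The practice effects can be checked against the Session 1 and Session 4 means in Table 1. The sketch below pools the eight means of each kind before taking the percentage decrease; that pooling is an assumption, since the text does not say how the average was formed, and the names are illustrative.

```python
# (Session 1, Session 4) mean durations in seconds, as read from Table 1.
manipulation = {
    "M1": (8.89, 8.09), "M2": (8.88, 8.13),
    "M3": (17.36, 15.37), "M4": (10.46, 8.08),
    "M5": (10.37, 8.26), "M6": (19.69, 17.16),
    "M7": (17.87, 16.01), "M8": (20.36, 17.82),
}
travel = {
    "T1": (4.55, 4.52), "T2": (4.47, 4.53),
    "T3": (4.97, 4.83), "T4": (5.08, 4.80),
    "T5": (5.18, 4.85), "T6": (5.09, 5.02),
    "T7": (5.46, 5.03), "T8": (5.44, 5.05),
}


def pooled_decrease(table):
    """Percentage drop in the summed means from Session 1 to Session 4."""
    first = sum(s1 for s1, _ in table.values())
    last = sum(s4 for _, s4 in table.values())
    return 100 * (first - last) / first


def component_decrease(pair):
    """Percentage drop for a single component."""
    s1, s4 = pair
    return 100 * (s1 - s4) / s1


manipulation_drop = pooled_decrease(manipulation)
travel_drop = pooled_decrease(travel)
largest = max(manipulation, key=lambda code: component_decrease(manipulation[code]))

print(round(manipulation_drop), round(travel_drop), largest)
```

The component with the largest drop is a gross manipulation paired with a fine one, in line with the closing sentence of the paragraph.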


Discussion 


This study clearly demonstrates that the
time required by operators to move between
two adjustments in a repetitive dial setting
task depends on the precision of those
adjustments. Travel movements following a
fine adjustment are significantly slower than
movements following a gross adjustment. In
general, travel movements toward a fine
adjustment are significantly slower than
movements toward a gross adjustment. Thus, an
increase in travel time is associated with an
increase in the precision of the adjustment
either preceding or following the travel.

It is important to point out that variation 
in precision was accomplished without alter- 
ing in any observable way the make-up of the 
travel movement between the dials. In other 
tasks used to investigate the effects of toler- 
ance requirements on movement times, the 
character of the travel movement is usually 
modified in that the positioning necessary to 
perform the terminal manipulation is changed. 
Examples of this latter type of task would be 
tapping in a target area where the size of the 
target is varied (Fitts, 1954) or assembling 
pegs into holes where the difference between 
peg size and hole size is changed (Maynard, 
1956, Sec. 4, p. 95). In the present study, all
movements were between objects of constant 
size in a fixed location. It is difficult to see 
how the slowing of travel which accompanied 
increased precision of manipulation could be 
related to changed requirements for position- 
ing. 

If the content of the travel movement was 
not altered, i.e., no elements added or changed, 
how are the present results to be explained? 
It appears that the speed of an operator's
control movements is determined not only
by the specific requirements of the individual 
movement components but by the character- 
istics of the task as a whole. This study 
points out the close interrelation between 
parts of a work cycle. It is an oversimplifi- 
cation to conceive of a task as an additive 
combination of separate and independent ele- 
ments. 

Suppose that the standard time allowance
for the travel part of a task was derived from
a situation where the operator moved
between two gross adjustments. How much
would this time allowance be in error if it
were applied to other situations in which the
precision requirements of the manipulative
portions of the task were different? The
present study indicates that the duration of the
travel movement increases 8% when the
precision requirements of one of the terminal
manipulations are increased and that the
duration of this same basic travel movement
increases 11% when the precision requirements
of both terminal manipulations are increased.
These differences, though small, are highly
reliable and statistically significant. Whether,
with the present state of the art of industrial
time study practices, these statistically
significant differences are of practical
significance is another question.
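The 8% and 11% figures can be approximated from the Session 4 travel means in Table 1. The grouping below (both adjustments gross, exactly one fine, both fine) and the use of simple averages are assumptions made for illustration; the paper does not spell out the computation.

```python
# Session 4 mean travel durations (seconds), as read from Table 1.
session4 = {
    "T1": 4.52, "T2": 4.53,   # Condition I: both adjustments gross
    "T3": 4.83, "T4": 4.80,   # Condition II: one adjustment fine
    "T5": 4.85, "T6": 5.02,   # Condition III: one adjustment fine
    "T7": 5.03, "T8": 5.05,   # Condition IV: both adjustments fine
}


def mean(codes):
    """Average the Session 4 means for the listed travel codes."""
    return sum(session4[code] for code in codes) / len(codes)


baseline = mean(["T1", "T2"])
one_fine = mean(["T3", "T4", "T5", "T6"])
both_fine = mean(["T7", "T8"])

increase_one = 100 * (one_fine - baseline) / baseline
increase_both = 100 * (both_fine - baseline) / baseline

print(round(increase_one), round(increase_both))
```

On these assumptions a standard derived from the all-gross condition would understate travel time by roughly a twelfth to a ninth in the other conditions.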


Summary 


This study was concerned with the effects 
of precision as a variable on movement dura- 
tion. Ss were required to adjust alternately 
each of two dials on a control panel. The 
precision required to adjust each dial was 
systematically varied and the effects of this 
variation on the durations of four parts of 
the control movement were determined. An 
electronic motion analyzer recorded separately 
and automatically the durations of each part 
of the task: i.e., the two dial adjustments and 
the two travel movements. 

Results clearly demonstrated that the time 
taken by operators to move between adjust- 
ments depended on the precision require- 
ments of those adjustments. Travel move- 
ments following a fine adjustment were slower 
than movements following a gross adjustment, 
and, in general, travel movements toward a 
fine adjustment were slower than movements 
toward a gross adjustment. These findings 
indicate that the speed of control movements
is determined not only by the content of
individual movement components but by the
over-all characteristics of the task. Results 
provide additional evidence to refute the con- 
cept that a work cycle consists of an additive 
combination of independent elements. 


Received February 17, 1959. 


REFERENCES 


Davis, R., Wehrkamp, R., & Smith, K. U.
Dimensional analysis of motion: I. Effects of laterality
and movement direction. J. appl. Psychol., 1951,
35, 363-366.

Fitts, P. M. The information capacity of the
human motor system in controlling the amplitude of
movement. J. exp. Psychol., 1954, 47, 381-391.

Harris, S. J., & Smith, K. U. Dimensional analysis
of motion: VII. Extent and direction of
manipulative movements as factors in defining motions. J.
appl. Psychol., 1954, 38, 126-130.

Maynard, H. B. (Ed.) Industrial engineering
handbook. New York: McGraw-Hill, 1956.

Rubin, G., Von Trebra, Patricia, & Smith, K. U.
Dimensional analysis of motion: III. Complexity
of movement pattern. J. appl. Psychol., 1952, 36,
272-276.

Ryan, T. A. Multiple comparisons in psychological
research. Psychol. Bull., 1959, 56, 26-47.

Simon, J. R. The duration of movement
components in a repetitive task as a function of the
locus of a perceptual cue. J. appl. Psychol., 1956,
40, 295-301.

Simon, J. R., & Smader, R. C. Dimensional
analysis of motion: VIII. The role of visual
discrimination in motion cycles. J. appl. Psychol., 1955, 39,
5-10.

Wehrkamp, R. A., & Smith, K. U. Dimensional
analysis of motion: II. Travel distance effects. J.
appl. Psychol., 1952, 36, 201-206.




Journal of Applied Psychology
Vol. 43, No. 6, 1959


ON THE EQUIVALENCE OF CLINICAL AND 
STATISTICAL METHODS 1


DANIEL SYDIAHA 


University of Saskatchewan 


This paper seeks to resolve an inconsist- 
ency in Meehl’s (1954) analysis of statisti- 
cal vs. clinical methods of assessment. (For 
purposes of this paper, the term “statistical
method” refers to the arithmetic combination 
of data by means of an equation or table. 
“Clinical method” refers to the process where- 
by a judge makes a decision or diagnosis after 
“reflecting upon” all of the alleged relevant 
information at his disposal. This process may 
be said to be nonmechanical or informal 
(Meehl, 1954, p. 16) and it may involve in- 
tuition on the part of the clinician to a greater 
or lesser degree.) 

On the one hand, Meehl contends that of 
the two assessments, the clinical one is based 
on more information since it typically in- 
cludes information which has been described 
as unsystematic, qualitative, and idiosyn- 
cratic, in addition to the systematic infor- 
mation which is common to both. Repeat- 
ing Meehl's illustration, a statistical clerk is 
unable to organize this unsystematic infor- 
mation and incorporate it into a decision or 
diagnosis, as a clinician does. On the other 
hand, Meehl’s summary of the empirical evi- 
dence suggests that clinical predictions are 
no better than statistical ones. 

There are at least two ways of resolving 
this inconsistency: 

Assumption A: Meehl’s contention may be 
illusory, i.e., the clinician may have the mis- 
taken belief that unsystematic information is 
being incorporated into his decision, whereas 
in fact such information may contribute noth-
ing new, but may merely confirm a set estab-
lished by previously scanned systematic in-
formation. It would follow from this that
statistical and clinical judgments would be
equivalent, i.e., both methods would generate
the same decisions.


1 Based on a thesis submitted to McGill University
in partial fulfillment of the requirements for the
Ph.D. degree, 1958. The author gratefully acknowl-
edges the assistance of members of the Canadian
Army Personnel Selection Service, including its direc-
tor, W. R. N. Blair. E. C. Webster directed the re-
search, and John Kenyon assisted with computa-
tions. Financial assistance was provided by the
Psychiatric Services Branch, Saskatchewan Depart-
ment of Public Health, and by the Defence Research
Board of Canada, Grant No. 9435-53 to E. C.
Webster.

Assumption B: Unsystematic information 
might be subject to misinterpretation to a 
greater extent than is systematic information, 
i.e., clinical judgments may be subject to
such factors as interclinician bias, halo effects, 
and intraclinician inconsistency to such an ex- 
tent as to effectively contribute nothing to 
the prediction. It would follow from this 
that the two methods need not generate the 
same decisions, and the clinical decision 
would have a greater degree of error associ- 
ated with it. The purpose of this investiga- 
tion was to determine which of these two 
alternative assumptions was more plausible. 

The method involved developing two ra- 
tional, explicit decision making models, one a 
so-called statistical model, and the second a 
so-called clinical model, whose formal prop- 
erties approximated, at least in part, the 
“real” properties of the two respective assess- 
ment methods. Both models were designed 
to generate decisions on the basis of the fol- 
lowing considerations: 

(1) The “raw data” upon which the model 
decisions were based was provided by inter- 
viewers regularly engaged in making clinical 
judgments; in this instance Canadian Army 
personnel officers selecting recruits for the 
Canadian Army, regular force. 

(2) Both models were designed to maxi- 
mize the correlation between model decisions 
and the actual decisions made by the inter- 
viewers. In effect, the adequacy of the mod- 
els in accounting for the interviewers’ deci- 
sions was being investigated. 

The following hypotheses were advanced, 
consistent both with Meehl’s analysis and 
with Assumption B outlined above: 




H1: Decisions generated by a clinical deci-
sion making model correspond more
closely to real decisions than do decisions
generated by a statistical decision mak-
ing model.

H2: Decisions generated by clinical and sta-
tistical decision making models are not
equivalent, i.e., are imperfectly corre-
lated.

H3: Decisions generated by a clinical deci-
sion making model are more unreliable
than are decisions generated by a sta-
tistical decision making model, i.e., they
are subject to interclinician bias, halo
effect, and intraclinician inconsistency.


Attention should be drawn to the definition 
of the word “clinical” in the opening para- 
graph. According to Meehl (p. 15) “clini- 
cal” refers both to data, i.e., nonsystematic 
kinds of information, and to method, i.e., 
nonmechanical methods of combining data. 
(The word “systematic” is considered prefer- 
able to “psychometric” as used by Meehl, 
since much data which is “nonpsychometric” 
is “systematic” and can be combined me- 
chanically. Example: biographical informa- 
tion.) Meehl’s primary interest is in com- 
paring methods applied to the analysis of 
systematic data only. Apart from consider- 
ing the logical status of nonsystematic data 
(Chap. 6, pp. 37-67), he has devoted little 
attention to this aspect of the problem.

The point of view adopted in this paper is
that empirical comparisons of methods should 
include both systematic and nonsystematic 
data. Presumably this failure to consider 
nonsystematic data stems from the inability
to treat such data mechanically. (This pre-
sumption is supported later in this paper.)
Consequently, if one is to include nonsys-
tematic data in the comparison of methods,
one is left with a comparison between sys-
tematic data treated mechanically (the so-
called statistical method) and both system-
atic and nonsystematic data treated nonme-
chanically (the so-called clinical method),
which is the comparison examined in this pa- 
per. Admittedly, there is a confounding of 
data and method (as defined by Meehl) in 
this design. In defense of this design, it is 
to be noted that clinical methods (as defined 




herein) are in widespread use, for any num- 
ber of reasons, and the evaluation of such 
methods, according to the hypotheses out- 
lined above, is deemed desirable. 


Procedure 


Rationale of Decision Making Models. A modified 
Q-sort method of personality description was used 
as the basis for the clinical decision making model. 
(See Stephenson [1949] for an elaboration of the 
relevance of Q sort to clinical methods.) Use of Q 
sort for this purpose was based on the assumption 
that the essential operation involved in clinical as- 
sessment was the preparation of a case report in 
which the assessee is described in terms of dominant 
traits, abilities, motives, etc. The case report, in 
effect, represents the clinician’s explication or de-
fense of his decision or judgment. Thus the Q sort
and the case report are similar in that they are
methods of describing people, and they differ only
in that the Q sort is more systematic and includes
a fixed number of descriptive statements. This sys-
tematic aspect of the Q sort was desirable for pur-
poses of this study since it made possible the assess-
ment of errors of measurement according to H3 out-
lined above.

The discriminant function was used as the basis 
for the statistical decision making model. While the 
use of a linear model rather than curvilinear or in- 
teractive models limits the validity of the results, it 
was selected as being the only known, practicable 
method of combining scores from 26 variables. It
is acknowledged that the use of curvilinear or inter-
active models might drastically alter the conclusions
drawn in this paper.

For purposes of convenience, the terms “clinical
scores” and “statistical scores” will be used in the
balance of this paper to designate decisions gener-
ated by clinical and statistical decision making mod-
els respectively. The reader is reminded that such
scores refer only to the model decisions as defined
above, and not to the actual decisions made by in-
terviewers.


Subjects. Each of eight Canadian Army Regular
Force personnel officers interviewed from 14 to 50
regular force applicants for “other” ranks in the
Canadian Army, i.e., ranks other than officers and
noncommissioned officers. Total N = 256. Inter-
views were conducted under usual Army circum-
stances except for additional information required
for this study, and the interviews were sound re-
corded. Each case was assessed by only one officer
who classified the applicant as either an “accept” or
“reject,” and who provided all of the information
required for this study. Reject cases included
applicants recommended for reassessment at a later
date and 7 applicants referred for psychiatric as-
sessment. Accept cases included 29 applicants con-
sidered only marginally suitable for Army service.
The typical induction procedure, following the
personnel officer’s decision, was a review of this de-
cision by the commanding officer of the personnel
depot. If the application was approved by the com-
manding officer, the recruit applicant was then sworn
into the Army. The decision of the personnel officer
was crucial to the applicant’s acceptance, even though
the final decision was made by the commanding offi-
cer, since the personnel officer’s decision was over-
ruled by his commander very infrequently.

The cases were divided into three separate groups 
to permit cross-validation of findings. Cases pro- 
vided by Officers A, B, F, and G (N = 38, 50, 50, 
41) were randomly assigned to a criterion group 
(N =89) and a holdout group (N = 90). The re- 
maining cases, i.e., those provided by Officers C, D, 
E, and H (N = 14, 23, 22, 18), made up a second 
holdout group (N=77). (The terms “criterion 
group” and “holdout group” are used to designate 
the samples used in the two stages of cross-valida-
tion procedures, i.e., optimal scoring procedures
were developed with the criterion group, and these
procedures were then applied to the holdout groups.
Unless otherwise indicated, only results obtained 
with holdout groups are reported.) 
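
The random assignment described above can be sketched as follows. This is only an illustration: the paper does not say how the randomization was carried out. The holdout Ns reported for Officers A, B, F, and G (19, 25, 25, 21) suggest that each officer's cases were divided roughly in half, so the sketch assumes a split within officer, with the odd case going to the holdout group. The function name and the seed are hypothetical.

```python
import random


def split_by_officer(cases_per_officer, seed=0):
    """Randomly split each officer's cases into criterion and holdout halves.

    Assumption: the split is made within officer, and when an officer has an
    odd number of cases the extra case goes to the holdout group.
    """
    rng = random.Random(seed)
    criterion, holdout = [], []
    for officer, n_cases in cases_per_officer.items():
        case_ids = [(officer, i) for i in range(n_cases)]
        rng.shuffle(case_ids)
        n_holdout = (n_cases + 1) // 2
        holdout.extend(case_ids[:n_holdout])
        criterion.extend(case_ids[n_holdout:])
    return criterion, holdout


# Case counts for Officers A, B, F, and G as given in the text.
criterion, holdout = split_by_officer({"A": 38, "B": 50, "F": 50, "G": 41})
print(len(criterion), len(holdout))  # 89 90
```

Scoring rules would then be derived from the criterion list only and applied unchanged to the holdout list.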

Clinical Scores. Clinical scores were based on a 
120-item Q sort completed by the officer for each 
applicant at the termination of the interview and 
after he had made his decision. The 120 statements 
were obtained in a preliminary critical incident sur-
vey of 141 Army interviews in which personnel offi-
cers related factors particularly important in leading 
to decisions to accept or reject applicants. Both ob- 
jective facts and subjective impressions were ob- 
tained in a list of 182 incidents, and these were 
classified into the 120 Q-sort items. The officers’
terminology was preserved as much as possible, and 
the content of each item was limited to a single trait 
or attribute. To facilitate the sorting procedure, a 
printed check list was used in which the officer as-
signed the items to a 9-point continuum (unforced
distribution) ranging from least to most descriptive
of the applicant.

In addition to a description of each applicant,
each officer used the check list to describe an ideal
applicant at the conclusion of the study. This
“ideal” represented each officer's judgment of the
relative importance of the check list items in de-
scribing applicants thought most likely to succeed
in the Army. This was done to test the validity of
the check list procedure, since if it were valid one
would expect accepted applicants to correspond more
closely to the ideal than would rejected applicants.
A “congruence” score was calculated for each of the
256 descriptions, which consisted of a Pearson prod-
uct-moment correlation coefficient between each de-
scription and the ideal as described by the inter-
viewing officer. The magnitude of such congruence
scores was compared for accepted and rejected
cases.
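
A congruence score of this kind is simply the product-moment correlation between two sets of item placements. The following minimal sketch uses six made-up items rather than the 120 of the check list; the ratings and variable names are hypothetical and serve only to show the computation.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Hypothetical placements on the 9-point continuum (1 = least descriptive).
ideal = [9, 8, 7, 3, 2, 1]        # officer's description of the ideal applicant
applicant_a = [8, 9, 6, 2, 3, 1]  # resembles the ideal
applicant_b = [2, 3, 1, 8, 7, 9]  # roughly the reverse of the ideal

congruence_a = pearson_r(applicant_a, ideal)
congruence_b = pearson_r(applicant_b, ideal)
print(round(congruence_a, 2), round(congruence_b, 2))
```

If the check list is valid in the sense described, accepted applicants should tend to resemble applicant_a and rejected applicants applicant_b.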

Clinical scores were obtained by assigning num-
bers to items, ranging from 1 to 9, corresponding to
the 9-point continuum, and by combining the num-
bers arithmetically. To maximize the correlation be-
tween clinical scores and acceptance-rejection, this
scoring system was limited to 67 items found to be 
significantly related (.05 level) to acceptance-rejec- 
tion for the criterion sample, using point-biserial 
correlation as the item-analysis procedure. 
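
The item analysis and scoring just described might be sketched as below. Several details are assumptions and not taken from the paper: the point-biserial coefficient is computed as a Pearson correlation with acceptance coded 1 and rejection coded 0, significance is judged by converting a two-tailed critical t (supplied by the caller) into a critical r, and items negatively related to acceptance are reflected (10 minus the rating) before summing. The data are made up.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson r; equals the point-biserial r when y is coded 0/1."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def select_items(ratings, accepted, t_crit):
    """Return {item index: +1 or -1} for items significantly related to acceptance.

    ratings: one list of 1-9 item ratings per case; accepted: 1/0 per case.
    t_crit: two-tailed critical t for df = n - 2 (looked up by the caller).
    """
    df = len(accepted) - 2
    r_crit = t_crit / sqrt(t_crit ** 2 + df)  # critical r equivalent to t_crit
    keyed = {}
    for j in range(len(ratings[0])):
        r = pearson_r([case[j] for case in ratings], accepted)
        if abs(r) >= r_crit:
            keyed[j] = 1 if r > 0 else -1
    return keyed


def clinical_score(case, keyed):
    """Sum the selected items, reflecting negatively keyed ones (assumption)."""
    return sum(case[j] if sign > 0 else 10 - case[j] for j, sign in keyed.items())


# Hypothetical criterion sample: 8 cases, 3 items.
ratings = [[9, 1, 5], [8, 2, 4], [7, 3, 6], [8, 2, 5],
           [2, 8, 5], [3, 9, 6], [1, 7, 4], [2, 8, 5]]
accepted = [1, 1, 1, 1, 0, 0, 0, 0]

keyed = select_items(ratings, accepted, t_crit=2.447)  # .05 level, df = 6
print(keyed, clinical_score(ratings[0], keyed), clinical_score(ratings[4], keyed))
```

In the study the selection would be made on the criterion sample only, and the resulting key applied to the holdout cases.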

The reader may feel that these statistical pro- 
cedures have rendered the clinical decision making 
model somewhat statistical in nature, thus invalidat- 
ing the distinctions made in the opening paragraph 
of this paper. It is to be noted, however, that no 
validity for the model is claimed beyond the assump- 
tion inherent in the model, i.e., it is assumed that 
the Q sort approximates, at least in part, the activi- 
ties of a clinician. To the extent that this assump- 
tion is warranted, the results are valid. Further- 
more, to argue that setting up any artificial clinical 
situation (such as requiring the clinician to perform 
a Q sort) destroys its essential characteristics is to 
render the analysis of the clinical decision making
process untestable.

Statistical Scores. Statistical scores were based on 
test and biographical data regularly examined by 
Army personnel officers, namely the Army M test 
(a classification test comprising mechanical, verbal 
and nonverbal intelligence scores), the MMPI, and 
a biographical data sheet. An attempt was also 
made to scale interview content for inclusion in the 
statistical score, in line with Meehl’s argument that 
comparative studies of statistical and clinical judg- 
ments should be based on the same information. A 
method of content analysis was developed and ap-
plied to the criterion sample of one officer. The
content items significantly related to the accept- 
reject criterion were then applied to the holdout 
sample for the same officer: the number of items 
which remained significant was below chance ex- 
pectation, and it was assumed that interview con- 
tent could not be scaled. Since the officer chosen 
for analysis was judged to be the most systematic 
interviewer of the eight, no further attempts to scale 
interview content were made. 

Scores corresponding to statistical decisions were 
computed from a multiple-regression equation (dis- 
criminant function) so as to maximize the correla- 
tion between the obtained scores and acceptance- 
rejection for the criterion group. Thirteen items of 
biographical and test data were used, and these were 
selected at random from a total of 26 items which 
were available. (Computational labor involved for 
26 items was considered prohibitive. Subsequent 
analysis tended to justify use of only 13 items in 
that all 26 items combined by a standard score 
method gave comparable results: see Table 1, Col. 3 
and 4.) 

Following Ezekiel (1947) and McNemar (1955), 
selection of items was made on a random basis 
rather than on the basis of item analysis, as was 
done with clinical scores. Ezekiel and McNemar 
have pointed out that for multiple correlation meth- 
ods, item selection tends to capitalize on chance 
fluctuations in test distributions, with the result 
that marked shrinkage occurs in the correlation from 
criterion to holdout samples. Such shrinkage was in 
fact found to occur in this study. 




Table 1

Intercorrelations among (a) the Accept-Reject Criterion, (b) Statistical Scores, and
(c) Clinical Scores, for Holdout Samples of Cases

                 Statistical Scores vs.     Clinical Scores vs.   Statistical vs.
                 Accept-Reject a            Accept-Reject a       Clinical Scores b
                        13       26
Sample           N      Items    Items      N      r              N      r
(1)              (2)    (3)      (4)        (5)    (6)            (7)    (8)

A                19     .47      .44        19     .92            19     …
B                25     …        …          25     .92            …      …
C c              14     …        …          14     …              …      …
D                23     .29      --         23     .84            23     …
E                19     .31      --         22     .78            19     …
F                25     .07      -.06       25     .68            …      …
G                19     .48      .50        21     .82            19     …
H                17     .42      --         18     .81            …      …
A+B+F+G          88     .39      .38        90     .85            …      .36
C+D+E+H          73     .33      --         77     .80            …      .52

a Point-biserial correlations.
b Product-moment correlations.
c There were no rejected applicants in his sample. Consequently these correlations are either indeterminate
or not comparable to the others shown.


Results 
Validity of Check List Procedure 


Congruence scores were significantly higher
for accepted than for rejected applicants
(r_pbi = .78 between acceptance-rejection and
the distribution of congruence scores). It
was assumed that the check list discriminated
between accepted and rejected applicants.


Correlations of Acceptance-Rejection with 
Clinical and Statistical Scores 


Clinical scores correlated with acceptance-
rejection at a higher level than did statisti-
cal scores (see Col. 3 and 6, Table 1). This
result tended to support H1: decisions gener-
ated by the clinical model corresponded more
closely to real decisions than did decisions 
generated by the statistical model. 

Correlation between Statistical and Clinical
Scores


Correlations were .36 and .52 for the two
holdout groups (see Table 1, Col. 8). Thus
H2 was supported in that although these cor-
relations are significantly different from zero,
they do not approximate 1.00, and hence
there was reason to believe that the two sets
of scores were not equivalent.


Sources of Error Associated with Clinical
Scores


1. Interofficer bias 


There was no marked tendency for indi-
vidual officers to limit their descriptions to
specific Q-sort items since there were no sig-
nificant differences among the officers’ sam-
ples in the magnitude of the correlation be-
tween clinical scores and acceptance-rejection
(see Col. 6, Table 1; p > .2 as tested by the
method described in Snedecor [1948], p.
154. All correlation differences reported in
this paper were tested by this method). In
other words, the 67 selected Q-sort items pre-
dicted the decisions of all interviewers equally
well.


Verification of this result was obtained from
two other sources:

(a) The intercorrelations of ideal appli-
cants described by officers were high (range
.56 to .98; median = .81), which suggested
that they were looking for approximately the
same attributes in applicants.

(b) A second set of clinical scores was
computed based on the same considerations
as are outlined above, but with a separate set
of items selected for each of the four officers:
A, B, F, and G. In effect, a separate decision
making model was developed for each officer,
thus permitting individual officer biases to in-
fluence the results of this second set of clini-
cal scores to a maximum degree. The two
sets of scores obtained, however, were not
different (see Table 2), which tended to sug-
gest that interofficer biases were negligible.
Apparently the criterion variance accounted
for by items common to all four officers was
equally well accounted for by the criterion
variance specific to each officer.


Table 2

Correlations between Clinical Scores, as Determined by Two Methods of Item Scoring,
and the Accept-Reject Criterion

                     Items Selected for       Items Selected for
                     Individual Officers      Combined Sample
                     No. of                   No. of
Sample       N       Items      r_pbi         Items      r_pbi

A            19      28         .90           67         .92
B            25      48         .93           67         .92
F            25      18         .83           67         .68
G            21      33         .78           67         .82
A+B+F+G      90                 .87 a         67         .85

a Mean correlation.
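
Table 2 reports .87 as the mean of the four individual-officer correlations (.90, .93, .83, .78) without stating how the mean was taken. One common way to average correlations is through Fisher's z transformation; the sketch below assumes that method, unweighted, purely for illustration. The function name is hypothetical.

```python
from math import atanh, tanh


def mean_correlation(rs):
    """Average correlations via Fisher's z (assumed method, unweighted)."""
    zs = [atanh(r) for r in rs]
    return tanh(sum(zs) / len(zs))


# Individual-officer correlations from Table 2.
officer_r = {"A": 0.90, "B": 0.93, "F": 0.83, "G": 0.78}

z_mean = mean_correlation(list(officer_r.values()))
simple_mean = sum(officer_r.values()) / len(officer_r)
print(round(z_mean, 2), round(simple_mean, 2))  # 0.87 0.86
```

The z-based mean rounds to the tabled .87 and the arithmetic mean to .86. Either way, the individually keyed scores do no better than the .85 obtained with the common 67-item key.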


2. Reliability 


Split-half reliability and internal consist- 
ency of clinical scores were high (see Table 3). 
There were significant differences among offi-
cers for the coefficients of homogeneity ob-
tained (p < .001) but not for the coefficients
of split-half reliability (p > .1). 
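
For readers unfamiliar with the two coefficients, a minimal sketch follows. It rests on assumptions the paper does not spell out: the split-half coefficient is taken as the odd-even correlation stepped up by the Spearman-Brown formula, and homogeneity is computed as coefficient alpha, which reduces to Kuder-Richardson Formula 20 when items are scored 0 or 1. The ratings are made up.

```python
from math import sqrt
from statistics import pvariance


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def split_half_reliability(ratings):
    """Odd-even split-half correlation with Spearman-Brown correction."""
    odd = [sum(case[0::2]) for case in ratings]
    even = [sum(case[1::2]) for case in ratings]
    r_half = pearson_r(odd, even)
    return 2 * r_half / (1 + r_half)


def coefficient_alpha(ratings):
    """Coefficient alpha; identical to KR-20 for dichotomous (0/1) items."""
    k = len(ratings[0])
    item_vars = [pvariance([case[j] for case in ratings]) for j in range(k)]
    total_var = pvariance([sum(case) for case in ratings])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


# Hypothetical ratings: 5 applicants by 4 selected items.
ratings = [[8, 7, 9, 8], [6, 6, 5, 7], [3, 4, 2, 3], [9, 8, 8, 9], [2, 1, 3, 2]]

print(round(split_half_reliability(ratings), 2), round(coefficient_alpha(ratings), 2))
```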


3. Halo effect 


Intercorrelations among ten randomly se- 
lected Q-sort items were low (range from 
—.45 to .25; mean = .016; when items were 
reflected so as to make them uniform in di- 
rection as to favorability, mean = .081). This 
result tended to suggest the absence of halo 
effects and other such “errors of association” 
in which raters attribute unwarranted rela- 
tionships among items (Guilford, 1954, pp. 
278-280). 


Table 3 


Coefficients of Homogeneity (Kuder-Richardson Formula 20) and Reliability (Split-half) for
Clinical Scores, Based on 67 Selected Items


                 Coefficient of            Coefficient of
                 Homogeneity               Reliability
                 Criterion    Holdout      Criterion    Holdout
Sample           Sample       Sample       Sample       Sample
N o 92 94 96 97 
B 94 96 .6 97 
C 768 
D 87 
E 82 
F 85 86 79 81 
G 91 91 78 78 
H OT 
A+B+F+G 93 88 : 
All cases ot 
(A-H) 
a Clinical scores for Officer C are restricted in range since there were no rejected applicants in his sample.




Sources of Error Associated with Statistical 
Scores 


There was no marked tendency for the in- 
dividual officer samples to be associated with 
specific items of test and biographical data 
since there were no significant differences 
among the officers’ samples in the magnitude 
of the correlation between statistical scores 
and acceptance-rejection (p > .8, see Table 1). 
In other words, the statistical decision mak- 
ing model worked equally well for all officer 
samples. 

The assessment of the reliability of test 
and biographical data was considered un- 
necessary in view of the high reliability co-
efficients obtained for clinical scores, i.e., H3
was rejected on grounds that the reliability of 
clinical scores was at least as high as the reli- 
abilities usually obtained with psychological 
tests. 


Discussion 


Hypotheses 1 and 2 were confirmed: the 
clinical decision making model corresponded 
more closely to real decisions than did the 
linear statistical decision making model, and 
the two models were imperfectly correlated. Meehl’s
contention that the clinician performs a dis- 
tinctive role of interpreting and classifying 
idiosyncratic information was thus supported. 
Assumption A, i.e., that the clinical incorpora-
tion of unsystematic data was illusory, was
rendered implausible by these results.

This study failed, however, to reveal any 
substantial degree of error associated with 
clinical decision making models, as might 
have been expected by alternative Assump- 
tion B. This is rather surprising in view of 
Meehl’s review of the evidence, cited earlier,
which indicated that clinical predictions are 
no better than statistical ones. If the result 
of the decision making process (the clinical 
prediction) contains error, then one might ex- 
pect to find sources of error in the decision 
making process itself. 

There are at least five possible explanations 
for this discrepancy between the implied pres-
ence of error associated with clinical judg-
ment in the studies reviewed by Meehl and
the apparent absence of error associated with
the clinical scores obtained in this study:


(a) Halo effects and other errors of association 
may have been present in the data of this study but 
may not have been adequately assessed by means of 
the 10 items selected. 

(b) It is still possible for interviewers to agree 
among themselves as to what traits or attributes are 
important in assessment, and still disagree as to 
standards of acceptance required of an interviewee. 
This would result in differences among interviewers 
in their decisions about the same applicants. 

(c) It is also possible that the criterion variance 
unaccounted for by the clinical decision making 
model used in this study was subject to interviewer 
bias. Since the 67 Q-sort items accounted for ap-
proximately 64% of the decision variance (the cor-
relation between acceptance-rejection and clinical
scores was approximately .8) the balance of the vari-
ance, i.e., 36%, was unaccounted for, and may have
included some error variance. 

(d) The circumstances under which the study was 
conducted may have been such as to make the in-
terviewers more cautious and hence more reliable in
their decisions than they would have been normally. 

(e) The interviewers may have attended to invalid 
information or they may have applied nonoptimal 
weights to information examined. 
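
The percentages in explanation (c) follow from squaring the correlation between clinical scores and acceptance-rejection. Taking the rounded value of .8 given in the text:

```latex
r_{pbi} \approx .8
\quad\Rightarrow\quad
r_{pbi}^{2} \approx .64 \;\text{(variance accounted for)},
\qquad
1 - r_{pbi}^{2} \approx .36 \;\text{(variance unaccounted for)}
```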


The assessment of these factors was beyond 
the scope of this study. Factors (b) and (c) 
would require that a sample of interviewers 
interview the same applicants. Regarding 
(d), all officers reported at the conclusion of 
the project that they had given more thought 
to their decisions during the course of the 
study than was customary. There was also 
reason to believe that personnel selection in
the Canadian Army may differ from most se-
lection procedures in that the reports of per-
sonnel officers are carefully examined by their
superiors in the Personnel Selection Service.
This might be expected to result in greater
uniformity among interviewers.

It should be understood that the validity of
the decision making models developed in
this study was not investigated, and conse-
quently Factor (e) could not be evaluated.
The utility of Q-sort methods of the kind used
here must be evaluated with caution (a fol-
lowup is being planned in which the clinical
and statistical scores will be correlated with
criterion measures based on the inductees’
performance records following three years of
regular Army service). A further limitation
which bears noting is the fact that this is the
first study of its kind, and further research
into the appropriateness of Q sort to other
clinical situations is obviously needed.




Keeping these reservations in mind, this 
study nonetheless strengthens the status of 
the clinical assessment. The results obtained 
suggest that in spite of idiosyncrasies of in- 
terviewing style and content, it is possible 
for interviewers to describe their interviewees 
in comparable and consistent terms. This is 
a rather startling discovery, in view of the 
critical comments which are often directed 
toward clinical methods, a criticism which, 
as a matter of interest, was shared by the 
author at the beginning of the study. Fur- 
thermore, the fact that the clinical scores, as 
developed herein, correlate at a low level with 
biographical and test data suggests that such 
scores might profitably be included as sepa- 
rate test scores in a selection battery. The 
development of such Q-sort-derived measures 
might be a more fruitful line to pursue in test
development than the development of con-
ventional tests. As Cronbach and Meehl
(1955) have pointed out, the validity of the
clinician’s constructs deserves a place in psy-
chological thinking and research.


Summary 


Linear statistical and clinical models of per- 
sonnel assessment were compared with respect 
to: (a) correlation with interview decisions, 
(b) correlation between models, and (c) 
errors of measurement.

Eight interviewers assessed from 14 to 50 
Canadian Army applicants using information 
obtained from biographical and test data, 
and from interview conversation. Each ap-
plicant was described on a 120-item Q-sort
check list. These data were quantified and 
combined into composite statistical scores 
(biographical and test data) and clinical 
scores (Q-sort data). 

The results indicated that: (a) clinical 
scores were associated more closely with de- 
cisions than were statistical scores; (b) sta- 
tistical and clinical scores correlated at a low 
level; (c) the decisions of different interview- 
ers were associated with the same Q-sort, 
biographical, and test data; and (d) sta- 
tistical and clinical scores were comparable 
in reliability. 

It was concluded that the clinical model of 
assessment was not reducible to the linear sta- 
tistical model and that interviewers’ methods 
were comparable and consistent under the ex- 
perimental conditions used. 


REFERENCES 
Cronbach, L. J., & Meehl, P. E. Construct va-
lidity. Psychol. Bull., 1955, 52, 281-302.

Ezekiel, M. Methods of correlation analysis. (2nd
ed.) New York: Wiley, 1947.

Guilford, J. P. Psychometric methods. (2nd ed.)
New York: McGraw-Hill, 1954.

McNemar, Q. Psychological statistics. (2nd ed.)
New York: Wiley, 1955.

Meehl, P. E. Clinical vs. statistical prediction: A
theoretical analysis and a review of the evidence.
Minneapolis: Univer. Minnesota Press, 1954.

Snedecor, G. W. Statistical methods. (4th ed.)
Ames: Iowa State Coll. Press, 1948.

Stephenson, W. The study of behavior: Q sort
and its methodology. Chicago: Univer. Chicago
Press, 1953.


(Received February 27, 1959) 


Journal of Applied Psychology
Vol. 43, No. 6, 1959


GROUP PERSUASION BY THE MOST KNOWLEDGE-
ABLE MEMBER UNDER CONDITIONS OF INCU-
BATION AND VARYING GROUP SIZE 1


ROBERT C. ZILLER AND RICHARD BEHRINGER 2


Fels Group Dynamics Center, University of Delaware


The study reported here investigates some
conditions under which groups fail to utilize
the resources of the most knowledgeable mem-
ber in a group decision making situation.
The experimental conditions include the pres-
ence or absence of a silent recess or incuba-
tion period imposed during the group discus-
sion as well as variations in group size.

In the extensive body of literature concern-
ing the relative problem solving performance
of groups and individuals (Kelley & Thibaut,
1954), it is sometimes maintained that vari-
ous group factors such as group power struc-
ture (Torrance, 1954; Ziller, 1955) and pres-
sures toward uniformity frequently interfere
with the optimum utilization of the group’s
resources but particularly the resources of the
most well informed member. By the expedi-
ent of including an accomplice with a correct
answer and a correct problem solving process
in each experimental group, it was possible to
design and conduct a more definitive analysis
of this proposition.

With regard to group size, the influence of
the informed member (advocate hereafter)
was expected to decrease as the size of the
group increases; for as the size of the group
increases the probability of an opposed coali-
tion may be expected to increase. Moreover,
the number of participants was expected to be
related inversely to the perceived prominence
of the advocate’s arguments.

A recess or incubation period was intro-
duced as an independent variable in order to
investigate the correlates of incubation in a
group setting. Research regarding incubation
in an individual problem solving situation
suggests that interferences resulting from emo-
tional causes and sets may dissipate under
conditions of enforced delay by serving to
interrupt a set and allow competition from
other sets (Schachter, 1951). The “sleeper
effect” (Hovland & Weiss, 1951), that is, the
paradoxical increase in opinion change which
sometimes appears after a lapse of time fol-
lowing an induction attempt, lends further
support to this hypothesis. Thus it was pre-
dicted that a highly informed group member
(advocate) is more effective in a group de-
cision making situation in which an incuba-
tion period is imposed in comparison with a
situation in which there is no such break in
the group decision making process.


1 This report is an extension of a paper presented
at the APA meeting, Washington, D. C., September
1958.

2 The authors wish to thank their associates in the
Fels Group Dynamics Center who provided valuable
feedback for the revision of the original manuscript.


Method 


Subjects. One hundred ninety-nine male and fe-
male University of Delaware summer session stu-
dents (largely public school teachers) served as Ss.
The experiment was conducted with social science
classes during the regular class period.

Task. The decision making task required the
group or individual members to estimate the num-
ber of dots on a slide containing 1,050 black dots
scattered rather uniformly, yet in no geometric pattern,
over a white background in a figure resembling a
ping-pong paddle framed by a rectangle of minimum
dimensions. The slide was exposed for only 8 sec.

Experimental Procedure. Essentially, a 2 X 4 ex-
perimental design was employed involving four levels
of group size (two-, three-, four-, and five-person
groups) and the presence or absence of an incuba-
tion period. An accomplice, the advocate, was in-
cluded in each group under the eight experimental
conditions. Thus, for example, the two-person
groups were actually composed of a naive S and an
accomplice.

The advocates or accomplices were selected ran-
domly from the various classes participating in the
study and were instructed as to their role the
day prior to the experimental session. The procedure
which the advocates were asked to employ required
them to envision the paddle-shaped dot display en-
closed by the rectangle of minimum dimensions. The
paddle-shaped figure occupied approximately one-
half of the area within the rectangle. Thus the
number of dots within the figure could be approxi-
mated by estimating the width and length of the
rectangle in dots and dividing the product by two.
The advocates were informed as to the actual di- 
mensions (in dot-units) of the rectangle, the arith- 
metic process for approximating the correct answer, 
and the correct number of dots, 1,050. Finally, each 
advocate was asked to attempt to persuade his 
group to decide on the estimate of 1,050 dots or as 
near to that number as possible without, of course, 
revealing his role in the experiment. 
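
The arithmetic given to the advocates can be illustrated with a minimal sketch. The rectangle dimensions used here (42 by 50 dot-units) are hypothetical, chosen only because half their product equals the 1,050 dots on the slide; the report does not state the actual dimensions, and the function name is likewise an assumption.

```python
def estimate_dots(width_in_dots: float, length_in_dots: float) -> float:
    """Approximate the dot count of a figure that fills about
    one-half of its bounding rectangle: (width * length) / 2."""
    return width_in_dots * length_in_dots / 2


# Hypothetical dimensions, not taken from the report.
estimate = estimate_dots(42, 50)
print(estimate)  # 1050.0
```
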

At the outset of the experimental period it was ex- 
plained to the naive Ss that the study was concerned 
with group decision making but particularly con- 
cerned with the relationship between group size and 
the quality of the group decisions. After assigning 
the members to varying sized groups (homogeneous 
with regard to sex) the slide was displayed for 5 sec., 
and the Ss submitted individual estimates as to the 
correct number of dots on the slide. In the next 
phase, the various sized groups dispersed to separate 
rooms to discuss the problem and reach a group de- 
cision in a maximum of 15 min. (an adequate pe- 
riod of time on the basis of previous studies involv- 
ing this task). 

Finally, when the groups had submitted their esti- 
mates, the members completed a questionnaire ask- 
ing them to submit an individual estimate as to the 
“number of dots that they personally really thought 
there were on the slide” and to respond to questions 
designed to measure group satisfaction. 

Following the completion of the questionnaire, the 
nature of the study was explained in detail and the 
naive Ss were asked to reveal if they had suspected 
that a member of their group was collaborating with 
E. It became necessary to discard five groups on 
the basis of these responses. 

In accordance with the 2 X 4 factorial design in- 
volving incubation and group size, respectively, those 
groups in which a recess or incubation period was 
formalized were instructed that after the first 4 min. 
of discussion the group members were to recess in 
silence for 3 min. and review privately the preceding 
discussion. (In pilot studies it had been found that 
no group completed the problem in less than 4 min.) 
After the 3 min. recess, the groups were instructed 
to resume their discussion and submit a group de- 
cision in a maximum of 11 min. 

Measures. The dependent variables included meas- 
ures of influence and group satisfaction. The four 
measures of influence were: (a) self report, (b) the 
mean of the members’ indices of estimate change, 
(c) group decision error, and (d) mean error of the 
group members’ post-group-decision estimates. 

The first influence measure was derived from the 
group mean of the weighted responses to the follow- 
ing item: “I changed my individual estimate of the 
number of dots a great deal as a result of the group 
discussion.” The alternatives were arranged on a 6- 
point scale varying from “agree very much” to “dis- 
agree very much.” This index represents the indi- 
vidual member’s perceived change. While it is 
quite possible that this phenomenological measure 
would not correlate highly with some of the other 
behavioral measures, it was included since it reflects, 
at the very least, resistance to persuasion from S’s 
point of view. (Actually the correlation coefficient 
between self-report and “members’ change” was 
found to be 0.50. See Table 1.) 

The second measure of influence was derived from 
the ratio of the difference between the accuracy of 
the individual's prediscussion and postdiscussion esti- 
mates to the difference between the prediscussion 
estimate and the correct answer. The numerator 
represents the degree of change toward the correct 
answer and the denominator the degree of change 
possible. An upper limit of 1.00 is inherent in the 
index and a lower limit of zero was imposed. The 
resulting ratio was subjected to the arc sine trans- 
formation. The mean of the members’ transformed 
indices provided the index of group estimate change. 
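
The second measure can be sketched in a few lines. This is an illustration under stated assumptions, not the authors' computation: the report does not give the exact form of the arc sine transformation, so the common variant arcsin(sqrt(p)) in radians is assumed; the handling of a prediscussion estimate that already equals the correct answer is also an assumption, as are all function names.

```python
import math

CORRECT = 1050  # correct number of dots on the slide


def estimate_change_index(pre: float, post: float, correct: float = CORRECT) -> float:
    """Change toward the correct answer divided by the change possible."""
    possible = abs(pre - correct)                # denominator: change possible
    if possible == 0:
        return 0.0                               # assumption: nothing left to change
    achieved = possible - abs(post - correct)    # numerator: reduction in error
    return max(0.0, achieved / possible)         # lower limit of zero imposed


def arcsine_transform(p: float) -> float:
    """Assumed form of the transformation: arcsin(sqrt(p)), in radians."""
    return math.asin(math.sqrt(p))


def group_change_index(pre_post_pairs, correct: float = CORRECT) -> float:
    """Mean of the members' transformed indices."""
    values = [arcsine_transform(estimate_change_index(pre, post, correct))
              for pre, post in pre_post_pairs]
    return sum(values) / len(values)
```

For a hypothetical member who moves from 500 to 775, the error falls from 550 to 275, so the untransformed index is 0.5. A member who moves away from the correct answer receives zero because of the imposed lower limit.
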

The third measure of influence was simply the dif- 
ference between the group decision and the correct 
answer (the advocate’s position), and may be inter- 
preted as representing the extent to which the group 
resisted the advocate’s arguments. Because of the 
heterogeneity of these scores, it was necessary to ap- 
ply the cube root transformation. 

The fourth measure of group influence was very 
similar to the preceding index but the individual 
group member's post-group-decision estimate was 
substituted for the group decision. The mean abso- 
lute error of the group members was calculated and 
again the cube root transformation was applied. 
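
The third and fourth measures can be sketched as follows. The sketch assumes that the difference in the third measure is taken as an absolute value (the report does not say whether sign was retained) and that, for the fourth measure, the cube root is applied to the mean absolute error; function names are hypothetical.

```python
CORRECT = 1050  # correct number of dots (the advocate's position)


def cube_root(x: float) -> float:
    """Cube root of a non-negative number."""
    return x ** (1.0 / 3.0)


def group_decision_error(group_decision: float, correct: float = CORRECT) -> float:
    """Third measure: transformed error of the group decision (absolute value assumed)."""
    return cube_root(abs(group_decision - correct))


def mean_member_error(post_estimates, correct: float = CORRECT) -> float:
    """Fourth measure: cube root of the members' mean absolute error."""
    mean_abs = sum(abs(e - correct) for e in post_estimates) / len(post_estimates)
    return cube_root(mean_abs)
```

On this scale a transformed score of 5 corresponds to an error of 125 dots, which gives a sense of the magnitudes reported in Table 2.
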

The following items comprised the questionnaire 
regarding group satisfaction: 


1. I was extremely satisfied with the quality or 
degree of excellence of the decisions reached by 
my group. 

2. Consider the entire problem solving session: My 
opinion was given the utmost consideration. 

3. This has been the most stimulating experience 
I've had in a long time. 

4. This experience has greatly increased my re- 
spect for the group method of doing things. 

5. If I could have chosen the other members of 
the group myself, I couldn’t have done a better 
job. 


Again the alternatives were arranged on a 6-point 
scale ranging from “agree very much” to “disagree 
very much.” 


Table 1 


Correlations among Persuasion Measures 


                              Members’   Members’ Error   Groups’ Error 
                              Change     of Estimate      of Estimate 
Self-Report                    .50*        .05            [illeg.] 
Members’ Change                           -.24            -.41 
Members’ Error of Estimate                                [illeg.] 

* [illeg.] 


Robert C. Ziller and Richard Behringer 


Table 2 

Group Persuasion (Four Measures) in Relation to Group Size and Incubation 

          Incubation Period            No Incubation Period         Combined Mean 
Group          Measures                     Measures                   Measures 
Size     N     A     B     C     D     N     A     B     C     D     A     B     C     D 
2       (6)   4.0  50.9   5.7   4.8   (9)   4.6  54.2   4.0   3.5   4.4  52.9   4.7  [illeg.] 
3       (7)   3.6  37.7   8.0   6.6  (12)   4.8  54.6   6.4   5.0   4.4  48.4   7.0  [illeg.] 
4       (6)   3.4 [illeg.] 7.6  7.6   (7)   4.5 [illeg.] 5.9  4.7   4.0  37.3   6.7  [illeg.] 
5       (7)   4.6  58.3   6.4   4.3   (5)   4.8  66.7   6.3   4.0   4.7  61.8   6.4  [illeg.] 
Combined Mean(a) 3.9 44.5 6.9   5.8         4.7  53.9   5.6   4.4 

(a) Analysis of Variance: Incubation effects are significant at the .05 level with regard to Measures 
A, C, and D (F = [illeg.], [illeg.], and 4.15, respectively). Size effects are significant at the .05 
level of confidence with respect to Measure C (F = 2.77). 
Note.—In Measures A (self-report) and B (members’ estimate changes) a high score represents greater 
change or persuasion. In Measures C (error of members’ post-group-decision estimate) and D (error of 
group estimate), a low score represents greater change or persuasion. 


Results 


The correlation matrix involving the four 
measures of influence is given in Table 1. 
While none of the correlation coefficients are 
high, all four measures were retained on the 
basis of the assumption that the different 
measures of influence present various facets 
of the phenomenon under investigation. 

In a 2 X 4 analysis of variance design with 
an unequal number of groups in each cell 
(Snedecor, 1946), the relationships among 
the experimental conditions (incubation-no 
incubation and four variations of group size) 
and the four measures of influence were ana- 
lyzed (see Table 2). 


Table 3 

Analysis of Variance of Group Satisfaction 
(Questionnaire) in Relation to Group 
Size and Incubation 

                              Group Size              Combined 
                          2      3      4      5       Mean 
Incubation Period        5.0    4.5  [illeg.] [illeg.]  4.7 
No Incubation Period     5.2    4.3    5.3    4.9       4.9 
Combined Mean            5.1    4.4    4.7    5.0       4.8 

Source of Variation            df     MS       F       p 
Group Size                      3    5.00    3.38     .05 
Incubation                      1    2.53    1.71     — 
Question                        4   64.80   43.78     .01 
Size X Incubation               3    4.88    3.30     .05 
Size X Question                12    1.18     .80     — 
Incubation X Question           4     .24   [illeg.] 
Size X Incubation X Question   12     .65   [illeg.] 
Within Groups                 160    1.48 

Note.—In arriving at the figures in each cell [illeg.] each group to the five [illeg.]. 
The score itself is the mean for the groups within a cell. 
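
The legible F ratios in Table 3 are consistent with dividing each mean square by the within-groups mean square (1.48). The following is a minimal check under that assumption, using only the entries that can be read from the table; variable names are hypothetical.

```python
MS_WITHIN = 1.48  # within-groups mean square from Table 3

# (mean square, F as printed) for the legible rows of Table 3
table_3 = {
    "Group Size": (5.00, 3.38),
    "Incubation": (2.53, 1.71),
    "Question": (64.80, 43.78),
    "Size X Incubation": (4.88, 3.30),
    "Size X Question": (1.18, 0.80),
}


def f_ratio(ms_effect: float, ms_error: float) -> float:
    """F ratio: effect mean square over error mean square."""
    return ms_effect / ms_error


computed = {name: f_ratio(ms, MS_WITHIN) for name, (ms, _) in table_3.items()}

for name, (ms, printed) in table_3.items():
    print(f"{name:20s} computed F = {computed[name]:6.2f}   printed F = {printed:6.2f}")
```
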

With regard to self-reports (Measure A), 
the mean error of the member’s post-group- 
decision estimate (Measure C), and group- 
decision error (Measure D), the no-incuba- 
tion condition is associated, contrary to ex- 
pectations, with greater influence or change 
in opinion (p = .05). 

The results with regard to group size were 
statistically significant only with respect to 
the mean error of the group member’s post- 
group-decision estimate (Measure C). Here 
it was seen that in the two-person group (a 
single naive S), the advocate was most effec- 
tive. It was also noted that the five-person 
groups were somewhat more persuaded in 
terms of this criterion. Moreover, with re- 
gard to each of the four measures of influ- 
ence, the two- and five-person groups were 
influenced more than the three- and four- 
person groups. 

The results concerning group satisfaction 
were explored by means of a 2 X 4 X 5 analy- 
sis of variance design (Snedecor, 1946). (See 
Table 3.) Here, the variance attributable to 
group size was significant at the 0.05 level of 
confidence. Inspection of the combined means 
revealed that the members of two- and five- 
person groups expressed greater satisfaction, 
in general, than the members of three- and 
four-person groups. Moreover, the interac- 
tion effect between group size and incubation 
was statistically significant (p = .05). 


Discussion 


The results of the experiment indicate that 
group size and incubation are related to the 
degree to which the group tends to accept 
the suggestions of the most informed group 
member. However, the results with regard to 
group size were in accord with the original 
hypothesis only with reference to two-, three-, 
and four-person groups. While the three- 
and four-person groups in comparison with 
two-person groups tended to be less accurate 
in their estimates, changed their individual 
estimates less, and expressed less satisfaction 
with the group, the five-person groups in com- 
parison with three- and four-person groups 
tended to change more, were more accurate, 
and expressed greater satisfaction. 

The unexpected nature of these results is 
further emphasized by the contradictory re- 
sults of an earlier study (Ziller, 1957) which 
did not employ accomplices but involved a 
similar dot counting task and two-, three-, 
four-, and five-person groups. In the ref- 
erenced study, the three- and four-person 
groups, in general, expressed greater satisfac- 
tion with their groups than two- and five- 
person groups. Moreover, in this same study, 
the five-person groups submitted less accu- 
rate estimates of the correct number of dots 
than other groups. Thus, the results of the 
present study, particularly those results con- 
cerning five-person groups, can scarcely be 
attributed to size effects alone. 

By way of interpretation of these unex- 
pected results it was assumed, for analysis 
purposes, that the advocate in the present 
study was perceived as a deviate when all 
the original estimates of the naive members 
of a group were lower or higher than that of 
the advocate. In the three-, four-, and five- 
person groups, respectively, the advocates 
were actually found to be the deviates (as 
defined above) 47, 40, and 17% of the time. 
Thus, there appears to be a tendency for the 
advocate in the five-person group to find him- 
self less frequently defending an extreme po- 
sition relative to the other group members. 
Or again, more simply, there exists a greater 
probability that the scale of judgment of the 
larger group includes the judgment of the 
advocate who thereby being perceived as a 
moderate rather than a deviate may find the 
presumed tendency of groups to compromise 
is in his favor. The results with regard to 
satisfaction also would seem to support this 
position. Unfortunately, the limited number 
of groups in the experiment precludes this 
ad hoc analysis. Nevertheless, the results may 
be interpreted as demonstrating again the un- 
usual characteristics of three- and four-per- 
son groups with regard to the probability of 
coalitions (Mills, 1953). 
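
The working definition of a deviate used in this analysis can be stated compactly. The sketch below is an illustration only: it assumes the advocate's position is the correct answer of 1,050, the sample estimates are invented, and the function name is hypothetical.

```python
def advocate_is_deviate(naive_estimates, advocate_estimate: float = 1050) -> bool:
    """True when every naive member's original estimate lies on the same
    side of the advocate's estimate (all lower, or all higher)."""
    if not naive_estimates:
        raise ValueError("at least one naive estimate is required")
    all_lower = all(e < advocate_estimate for e in naive_estimates)
    all_higher = all(e > advocate_estimate for e in naive_estimates)
    return all_lower or all_higher


# Invented estimates for illustration.
print(advocate_is_deviate([400, 800, 900]))         # True: all below 1,050
print(advocate_is_deviate([400, 800, 900, 1500]))   # False: estimates bracket 1,050
```

With more naive members it becomes less likely that every original estimate falls on one side of the advocate, which is the pattern the percentages above describe.
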

With regard to the incubation variable, the 
results were diametrically opposed to the 
initial hypothesis. The most knowledgeable 
member was more effective under the no- 
incubation condition, and the group mem- 
bers also expressed greater satisfaction with 
the group products and processes under this 
condition. 

Experiments concerning incubation in indi- 
vidual problem solving situations initially 
suggested that interference resulting from 
emotional causes and sets is dissipated in 
time, thus permitting the acceptance of the 
well informed member’s (advocate) argu- 
ments. The “sleeper effect” represented this 
theoretical position. The discrepancy between 
these earlier findings and the findings of the 
present experiment may be explained, in part 
at least, by assuming that since there was 
only a single encounter between the advocate 
and the group in the earlier experiment there 
was less reason for Ss to perceive the recess 
as an opportunity for a private rehearsal of 
arguments in opposition to the advocate. 
Thus, in the present experiment in which the 
recess was followed by further discussion, the 
recess may have served to instigate rather 
than dissipate emotional sets. 


Summary 


The experiment reported here investigated 
some conditions under which a group fails to 
utilize the resources of the most knowledge- 
able group member in a decision making 
situation. By the experimental expedient of 
including an accomplice in each group who 
was informed as to the correct answer and a 
correct method of arriving at that answer, it 
was found that two- and five-person groups 
tended to be more accurate and more influ- 
enced, and expressed greater satisfaction with 
their groups than three- and four-person 
groups; and that groups in which no incuba- 
tion period was imposed were also more ac- 
curate and more influenced, and expressed 
greater satisfaction with their groups. 

The results with regard to the five-person 
groups were tentatively attributed to the in- 
creased scale of judgment in larger groups 
which presumably reduces the probability 
that the advocate must defend an extreme 
position relative to the other group members. 
With regard to the incubation effects, it was 
proffered that during the recess the naive Ss 
prepared for a second defense of their opin- 
ions by revitalizing their initial sets and re- 
establishing their initial arguments. 




REFERENCES 


Hovland, C. I., & Weiss, W. The influence of source 
credibility on communication effectiveness. Publ. 
opin. Quart., 1951, 15, 635-650. 

Kelley, H. H., & Thibaut, J. W. Experimental 
studies of group problem solving and processes. 
In G. Lindzey (Ed.), Handbook of Social Psy- 
chology. Cambridge: Addison-Wesley, 1954. Pp. 
735-785. 

Mills, T. M. Power relations in three-person 
groups. Amer. soc. Rev., 1953, 18, 351-357. 

Schachter, S. Deviation, rejection, and communi- 
cation. J. abnorm. soc. Psychol., 1951, 46, 190- 
207. 

Snedecor, G. W. Statistical Methods. (4th ed.) 
Ames: Iowa State Coll. Press, 1946. 

Torrance, E. P. Some consequences of power dif- 
ferences on decision making in permanent and 
temporary three-man groups. Res. Stud., Wash- 
ington State Coll., 1954, 22, 130-140. 

Ziller, R. C. Scales of judgment: A determinant 
of the accuracy of group decisions. Hum. Relat., 
1955, 8, 153-164. 

Ziller, R. C. Group size: A determinant of the 
accuracy and stability of group decisions. Soci- 
ometry, 1957, 22, 165-173. 


(Received March 2, 1959) 


Journal of Applied Psychology 
Vol. 43, No. 6, 1959 


JUDGMENT TIME FOR FORCED-CHOICE 
ADJECTIVE PAIRS 


ROBERT E. KRUG 

Carnegie Institute of Technology 


AND DORIS NORTHRUP 

University of Illinois 


Individuals are reluctant to use words or 
phrases of negative emotional tone as descrip- 
tions of self or of others. For this reason, 
respondents may resist forced-choice items 
which present only unfavorable alternatives. 
Some inventories attempt to reduce this re- 
sistance by requesting the choice of the least 
descriptive rather than most descriptive phrase 
for negative items, but there is no evidence 
concerning the effect of this procedure. Since 
negative items continue to be used in rating 
scales and inventories, it is apparently be- 
lieved that such items play an essential role 
in obtaining valid descriptions. The evidence 
on this matter also is inconclusive. 

In a study which compared several forms 
of a forced-choice rating scale (Highland & 
Berkshire, 1951) a form composed entirely 
of favorable items seemed adequate, since 
this form showed low biasability, adequate 
reliability and validity, and was favored by 
the users. Also in the area of ratings, Wherry 
(1951) showed that while raters actually be- 
lieve positively toned items to be more cru- 
cial to job success and more universally ap- 
plicable, these beliefs were only partially sup- 
ported by the validity data in his factor 
analysis of rating item indices. In the area 
of self description, Krug (1958) showed that 
judgments of unfavorable adjectives were as 
reliable as judgments of positive terms, and 
that responses to negative pairs appeared less 
influenced by the relative favorability (PI) 
of the members of a pair. It was suggested 
that S is more highly motivated to make an 
accurate self-description in the case of nega- 
tive pairs, since the risk appears greater. 

The present study employed judgment time 
as the dependent variable in an effort to as- 
sess the effects of (a) the favorable-unfavor- 
able dichotomy, (b) the response S is re- 
quested to make (least or most descriptive), 
and (c) the degree of PI discrepancy within 
a pair. 


Procedure 


The Ss were 64 senior division men in a college of 
engineering. Their participation in a one-hour in- 
dividual laboratory session was voluntary. 

From a list of 228 adjectives for which the Selec- 
tion Set Preference Index (Krug, 1958) was avail- 
able, 40 forced-choice pairs were constructed, Twenty 
pairs contained two favorable words, the other 20 
being composed of unfavorable words. In each set 
of 20, five pairs contained members identical in PI, 
five had a PI discrepancy of .20 (on a 7.0 scale), 
five a discrepancy of .75, and five a discrepancy of 
1.50. The standard deviation of the PI was matched 
as closely as possible within a pair. 
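
The composition of the 40 pairs can be laid out as a small design table. This is only a bookkeeping sketch; the labels and variable names are hypothetical, and the adjectives themselves are not reproduced here.

```python
from itertools import product

FAVORABILITY = ("favorable", "unfavorable")
PI_DISCREPANCIES = (0.00, 0.20, 0.75, 1.50)  # on the 7.0-point PI scale
PAIRS_PER_CELL = 5

# One entry per forced-choice pair: (favorability, PI discrepancy, pair number in cell)
design = [
    (fav, disc, k)
    for fav, disc in product(FAVORABILITY, PI_DISCREPANCIES)
    for k in range(1, PAIRS_PER_CELL + 1)
]

print(len(design))  # 40
```
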

The S was seated at a table which was isolated 
from E by a large black shield in which a 6 X 10-in. 
flash glass screen was mounted at S’s eye level. The 
pairs of adjectives were projected individually onto 
this screen from a Bell and Howell automatic slide 
projector. The presentation of a pair was preceded 
by a 2-sec. warning light which was the signal for 
S to attend to the screen. The onset of the pair 
started an electronic chronoscope, which was termi- 
nated by S moving one of the two switches mounted 
on the table before him. Moving the right-hand 
switch indicated that the right-hand word was the 
choice; moving the left-hand switch indicated a 
choice of the word appearing on the left. For the 
pairs with PI discrepancies greater than zero, the 
left-hand word had the higher PI in half of the 
cases. The sequence of pairs was randomized; each 
S was assigned one of 16 random sequences. 

In addition to instructions designed to familiarize 
S with the apparatus and the sequence of events, the 
following instructions were given relevant to the 
judgmental task: “The research we are conducting 
deals with personality test items. The items will be 
presented to you one at a time and each will con- 
sist of two adjectives. Your task is to decide which 
of the two adjectives is a more accurate description 
of you. This may not always be easy; there may be 
pairs such that both members will be adequate de- 
scriptions of you, and others where you will feel 
that neither are. Nonetheless, you are to make a 
choice in each case. The task is not a speed test, 
and we are not asking for snap judgments. A swift 
response is not better than a slow one. You should 
treat each item just as you would if you encoun- 
tered it on a printed personality inventory. How- 
ever, we are interested in judgment time, and it is 
important that you respond as soon as your decision 
is reached. Remember, the most important thing is 
that we want to obtain an accurate self-description, 
so please make your choices as seriously as possible.” 
Ss were randomly assigned to one of two response 
conditions. For 32 Ss, the response was in terms of 
the most descriptive adjective. For the remaining 
32, the instructions were modified to request the 
choice of the least descriptive term in each pair.1 


Table 1 

Summary of Analysis of Variance 

Source               df      MS        F        Error 
R                     1    4.7860    2.996       (a) 
(a) Ss within R      62    1.5977 
D                     3     .2212    9.178***    (b) 
F                     1     .0870    2.990       (c) 
D X F                 3     .1254    5.098***    (d) 
R X D                 3     .0872    3.618*      (b) 
R X F                 1     .2202    7.567**     (c) 
R X D X F             3     .1645    6.687***    (d) 
(b) D X Ss          186     .0241 
(c) F X Ss           62     .0291 
(d) D X F X Ss      186     .0246 

* p = .05. 
** p = .01. 
*** p = .001. 


Analysis of Results 


The time in milliseconds was recorded for 
each response, and a log transformation ap- 
plied. The mean log time for each S for the 
five pairs at each treatment-level combina- 
tion provided the basic data for analysis. 
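
A minimal sketch of how one such basic datum could be formed is given here. The base of the logarithm is not stated in the report, so base 10 is assumed; the five response times are invented for illustration and the function name is hypothetical.

```python
import math


def mean_log_time(times_ms) -> float:
    """Mean of the log-transformed response times (base 10 assumed)
    for the pairs at one treatment-level combination."""
    logs = [math.log10(t) for t in times_ms]
    return sum(logs) / len(logs)


# Invented times (msec.) for the five pairs in one cell for one S.
cell_times = [2100, 3400, 1800, 2600, 4000]
cell_score = mean_log_time(cell_times)
print(round(cell_score, 3))
```

Each S would contribute eight such scores, one for each combination of the four discrepancy levels and the two favorability levels.
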

1 To correspond with the usual inventory-practice, 
least choice should be made to the unfavorable pairs, 
and most choices to the favorable pairs. Pilot work 
indicated that the intrusion of the specific choice in- 
struction just before the pair appeared led to an 
anticipatory set which affected judgment time. The 
procedure actually employed seemed preferable to 
the other alternative of presenting the two sets of 
20 pairs separately. 

Since each S responded at all levels of PI 
discrepancy (D) for both favorable and un- 
favorable pairs (F) using one of the response 
conditions (R), the appropriate analysis of 
variance design is Lindquist’s mixed design, 
Type VI (Lindquist, 1956, pp. 292-297). 
Table 1 presents the summary of this analy- 
sis. Of the main effects, only D is statisti- 
cally significant, all second-order interactions 
are significant at the .05 level or beyond, and 
the R X D X F interaction is significant at 
the .001 level. Figure 1 illustrates this third- 
order interaction; the figure also serves as a 
table of means from which other effects may 
be plotted. 
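
The F ratios in Table 1 follow from dividing each effect's mean square by the mean square of the error term named in the last column. The check below uses the values as read from the table; the dictionary layout and names are hypothetical.

```python
# Error mean squares, keyed by the letters used in Table 1
error_ms = {"a": 1.5977, "b": 0.0241, "c": 0.0291, "d": 0.0246}

# effect: (mean square, error term, F as printed)
effects = {
    "R": (4.7860, "a", 2.996),
    "D": (0.2212, "b", 9.178),
    "F": (0.0870, "c", 2.990),
    "D X F": (0.1254, "d", 5.098),
    "R X D": (0.0872, "b", 3.618),
    "R X F": (0.2202, "c", 7.567),
    "R X D X F": (0.1645, "d", 6.687),
}

computed_f = {name: ms / error_ms[term] for name, (ms, term, _) in effects.items()}

for name, (ms, term, printed) in effects.items():
    print(f"{name:10s} error ({term})  computed F = {computed_f[name]:.3f}  printed F = {printed:.3f}")
```

The between-Ss effect (R) is tested against Ss within R, while each within-Ss effect is tested against its own interaction with Ss, as is usual for a mixed design of this kind.
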


Discussion 


An implicit assumption of the study was 
that judgment time might be taken as an 
index of S’s willingness to respond to a forced- 
choice item. This assumption is supported 
by the general decrease in judgment time as 
PI discrepancy increases. It must be noted, 
however, that the over-all curve is not a con- 
sistently decreasing one; mean time at the 
third level of D is slightly greater than at the 
second level. On the surface, this suggests 
that the pairs chosen to represent D2 and D3 
do not adequately represent the intended di- 
mension. That this is not the explanation 
will be demonstrated in the discussion of the 
interactions. For the moment, we wish to 
state simply that the significant effect of D 
indicates that the dependent variable pos- 
sesses relevance and that our results are gen- 
erally consistent with the previous finding 
of a relationship between decision time and 
stimulus difference (Festinger, 1943). 

Fig. 1. [Plot not reproduced; axis labels: Mean log-time, Intra-pair discrepancy.] 

The hypothesis that Ss are reluctant to re- 
spond to pairs containing only negative al- 
ternatives is not supported. Not only is the 
effect of F nonsignificant, but the observed 
difference is in the opposite direction. Past 
evidence concerning such resistance is pri- 
marily anecdotal, based on comments of re- 
viewers and respondents. The data of this 
study suggest that such commentary tells us 
very little about the actual behavior of Ss in 
the forced-choice test situation. 

Mean times for least descriptive choices are 
consistently higher than for most descriptive 
choices, but the difference is not statistically 
significant.2 In view of this, comment con- 
cerning the practice of asking for differential 
responses on inventories will be deferred until 
the interactions have been considered. 

The interaction which is so evident in Fig. 1 
is a surprising one. Two curves (favorable 
words, least descriptive responses; and unfa- 
vorable words, most descriptive responses) 
show a consistent downward trend. The re- 
maining two curves show a marked increase 
at the third level of PI discrepancy. These 
peaks cannot be attributed to the particular 
sample of pairs, nor to inadequate estimates 
of PI discrepancies, since each sample of pairs 
behaves properly under one of the response 
conditions. Since different Ss are involved 
in the two response conditions, one might in- 
voke this difference to account for the ob- 
served interaction. However, the assignment 
of Ss to form two homogeneous groups with 
such different response characteristics is 
viewed as extremely improbable. It would 
seem that we are faced with an interaction 
which is real rather than artifactual, but 
which is rather difficult to explain. 

We might ask which, if either, of the sets 
of curves (the peaked ones or the consistently 
decreasing ones) are associated with response 
bias. An analysis of the frequency with which 
the more favorable member of the pair was 
selected at the third level of D failed to give 
a consistent answer.3 For unfavorable pairs, 
there is no difference between most and least 
responses (proportions of .65 and .66). For 
favorable pairs, a significantly greater propor- 
tion (.65) chose the more favorable word un- 
der the most condition than under the least 
condition (.53). In other words, the bias 
which is expected for pairs with a PI dis- 
crepancy of .75 characterizes three of the four 
relevant points in Fig. 1. The only condition 
which does not show the expected bias is 
least response, favorable pairs. This point is 
on a consistently decreasing curve. 

2 Individual Ss use different portions of the time 
scale, and these absolute differences are sufficient to 
negate the apparent difference between the two 
classes of response. It should be noted, however, 
that Ss are not differentially affected by the D and 
F variables; the mean squares for D X Ss, F X Ss, 
and D X F X Ss are of a magnitude which warrants 
the assumption that these reflect pure error. 

The peaked curves are associated with the 
response conditions which prevail on many 
forced-choice inventories. It must be ob- 
served that this finding does not challenge the 
adequacy of such a procedure, provided that 
the intrapair PI discrepancy is zero. In fact, 
the four points of Fig. 1 which represent a 
zero discrepancy support the standard pro- 
cedure. However, if PI matching is inade- 
quate, the introduction of the differential re- 
sponse complicates matters considerably. The 
reason for this complication is not known. One 
might postulate an inhibition factor which in- 
teracts with R and F, and which becomes im- 
portant when pair members are discriminably 
different on a favorability dimension (S sus- 
pects a trick, and reconsiders his response), 
but our data cannot test this. What is dem- 
onstrated is that for two combinations of pair 
favorability and response, there is a level of 
PI discrepancy which is associated with in- 
creased response time. In this study, this 
level approximates one standard deviation of 
the Preference Index.4 

3 In this discussion, the reference is to the term 
preferred as a description; i.e., the term chosen in 
the most descriptive set, and the term not chosen in 
the least descriptive set. 

4 PI sigmas varied from .50 to 11; an average 
standard deviation would be near .75, which was the 
discrepancy at Level 3. 


Summary and Conclusions 


1. In general, the time required for S to 
choose one of the words in a forced-choice 
pair decreases as the intrapair PI discrepancy 
increases. 

2. Ss use no more time in responding to 
unfavorable pairs than to favorable ones. 
The reputed resistance to unfavorable alterna- 
tives does not appear. 

3. A complex relationship is demonstrated 
between PI discrepancy, the favorability of 
the pair and the type of response which is to 
be made. When S chooses the least descrip- 
tive term of two unfavorable words, or the 
most descriptive of two favorable words, in- 
creased judgment time is associated with an 
intermediate level of intrapair PI discrepancy. 

4. As a means of controlling bias, adequate 
pairing of forced-choice terms may be defined 
as a zero discrepancy within the pair. If a 
forced-choice inventory contains pairs which 
are not adequate by this definition, no satis- 
factory recommendation is available concern- 
ing the type of response to be requested. In 
a limited sense, it would be preferable to use 
one (either most or least) rather than two 
types of response, given variance in PI dis- 
crepancy. The adequate solution is to have 
no such variance. 


REFERENCES 

Festinger, L. Studies in decision: I. Decision-time, 
relative frequency of judgment, and subjective 
confidence as related to physical stimulus differ- 
ence. J. exp. Psychol., 1943, 32, 291-306. 

Highland, R. W., & Berkshire, J. R. A meth- 
odological study of forced-choice performance rat- 
ing. USAF Hum. Resour. Res. Cent. Bull., No. 
51-9, 1951. 

Krug, R. E. A selection set preference index. J. 
appl. Psychol., 1958, 42, 168-170. 

Lindquist, E. F. Design and analysis of experi- 
ments in psychology and education. Boston: 
Houghton Mifflin, 1956. 

Wherry, R. J. Factor analysis of rating item in- 
dices. AGO, PRS Tech. Res. Rep. No. 915, 1951. 


(Received March 5, 1959) 




Journal of Applied Psychology 


Vol. 43, No. 6, 1959 


THE LOWRY TEST: 


A SIMPLE STATUS-FREE MEASURE OF 
INTELLECTUAL ABILITY 


DELL LEBO 


Child Guidance and Speech Correction Clinic, Jacksonville 


ROBERT S. ANDREWS 


Quartermaster Field Evaluation Agency, Fort Lee 


AND OMER LUCIER 
Courtney and Company, Philadelphia 


Factors which introduce biases into intelli- 
gence tests have been of concern to psycholo- 
gists for some years. Socioeconomic status 
has been demarcated as one important source 
of such bias. This recognition has stimulated 
efforts to purify existing tests or develop new 
ones which are less subject to such bias. 

The Lowry Reasoning Test Combination¹
is one such effort. It employs stimulus ma- 
terials common to all status groups, thereby 
eliminating most social status bias. In addi- 
tion, it is brief, easily administered, and mini- 
mizes the verbal aspect. 

The test consists of two sets of 25 ques- 
tions. Stimulus items for both sets are drawn 
from constructs which are presumably familiar 
to all individuals beyond childhood in our 
culture, i.e., days of the week, squares, and 
matchsticks. Variance in concept difficulty 
is obtained by altering combinations while 
simultaneously maintaining a relatively con- 
stant level of word difficulty. 

The first group of questions involves rea- 
soning problems using the days of the week 
in a variety of ways. They are so phrased 
that there is an increase in difficulty with each 
succeeding question without any concomitant 
increase in the difficulty of the verbal mate-
rial needed to understand the problems. In
this way the verbal symbolism remains stated 
in simple words while S’s reasoning ability is 
put to further test. 

1 Test copyrighted 1956 by Ellsworth Lowry and
Omer Lucier. Copies are available from the latter,
1711 Walnut Street, Philadelphia.

Printed directions inform S that: “Each
answer is a day of the week. Remember that
Sunday is always the first day of the week.”
The days of the week then appear in a num- 
bered, horizontal list so that this information 
is always before S as he works on this series 
of questions. To make sure S understands 
what is required of him, four sample ques- 
tions are given. Two of them are: “If to- 
day were Saturday, what would tomorrow 
be?” “If today were the first day of the 
week, what was yesterday?” The first of the 
nonpractice questions is, “If today were the 
third day of the week, the day after tomor- 
row will be what day?” The last of these 
questions is, “If the odd days of the week 
came first, in order, then the even days, and 
if then the order were reversed, and if Mon- 
day were the first day of the week, what 
would be the fifth day of the week?” 
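
The simpler day-of-the-week items reduce to counting forward or backward around a seven-day cycle. The following is a minimal sketch of that arithmetic, assuming Sunday is day 1 as the printed directions state; the function names `shift` and `nth_day` are illustrative and do not come from the test itself.

```python
# Sunday is treated as day 1, as the printed directions specify.
DAYS = ["Sunday", "Monday", "Tuesday", "Wednesday",
        "Thursday", "Friday", "Saturday"]


def shift(day: str, offset: int) -> str:
    """Return the day `offset` days after `day` (negative offsets count back)."""
    return DAYS[(DAYS.index(day) + offset) % 7]


def nth_day(n: int) -> str:
    """Return the n-th day of the week, counting Sunday as 1."""
    return DAYS[n - 1]


sample_1 = shift("Saturday", 1)       # "If today were Saturday, what would tomorrow be?"
sample_2 = shift(nth_day(1), -1)      # first day of the week, then yesterday
first_item = shift(nth_day(3), 2)     # third day of the week, then the day after tomorrow

print(sample_1, sample_2, first_item)
```

The later items add reordering conditions whose wording admits more than one reading, so they are not modeled here.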

The earlier and simpler questions are in- 
tended to serve as learning situations for the 
later and more difficult combinations. The 
later items are almost impossible to solve 
without first attempting to solve all preced- 
ing problems. 

The second group of questions also starts 
with simple problems which progress in diffi-
culty and require the solution of earlier items.
This group consists of squares drawn by non- 
touching lines. The S is first shown three 
adjacent, numbered squares, made of simple 
lines in a vertical ladder-like sequence. He is 
asked to imagine that the squares were made 
with matchsticks. “How many matches must 
be removed so that the square numbered 1 
will be entirely gone, but the other two 
squares will remain complete?” Two other 
practice problems based on the same design
are given.


The first and second questions of the 25 
problems depend upon three similarly made, 
but differently arranged, squares. Squares 
numbered 1 and 2 are horizontal with the 
third square directly beneath the second. The 
S is asked, “How many matches must be re- 
moved so that Square Number 2 will be elimi- 
nated—be entirely gone—leaving the other 
two complete,” and “By removing two 
matches, only, which square can be elimi- 
nated.” 
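
The matchstick items can be reasoned about by treating each square as a set of four unit edges and asking which edges belong to one square alone. The sketch below assumes that adjacent squares share a single match along their common side; that assumption, the coordinate scheme, and the function names are illustrative, and the counts it produces are derived here rather than taken from any scoring key.

```python
from typing import Dict, FrozenSet, Set, Tuple

Point = Tuple[int, int]
Edge = FrozenSet[Point]


def square_edges(col: int, row: int) -> Set[Edge]:
    """The four unit matchsticks bounding the cell at (col, row)."""
    a, b = (col, row), (col + 1, row)
    c, d = (col, row + 1), (col + 1, row + 1)
    return {frozenset((a, b)), frozenset((c, d)),
            frozenset((a, c)), frozenset((b, d))}


def matches_to_eliminate(layout: Dict[int, Point], target: int) -> int:
    """Count the matches that belong to `target` alone, i.e. those that can
    be removed without breaking any other square in the layout."""
    others: Set[Edge] = set()
    for number, (col, row) in layout.items():
        if number != target:
            others |= square_edges(col, row)
    return len(square_edges(*layout[target]) - others)


# Practice design: three squares in a vertical ladder (row increases downward).
ladder = {1: (0, 0), 2: (0, 1), 3: (0, 2)}

# First scored design: Squares 1 and 2 side by side, Square 3 beneath Square 2.
bent = {1: (0, 0), 2: (1, 0), 3: (1, 1)}

ladder_counts = {n: matches_to_eliminate(ladder, n) for n in ladder}
bent_counts = {n: matches_to_eliminate(bent, n) for n in bent}
print(ladder_counts, bent_counts)
```

Under the shared-side assumption, Square 2 of the second design is the only square bounded by just two matches of its own, which is the kind of inference the second question calls for.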

The last seven items are based on a design
composed of 11 squares in three rows. The
first row of two squares is depicted atop a 
second row of four, so that Squares 1 and 2 
(the first row) are centered directly above the 
midmost squares (four and five) of the second 
row. The last row consists of five squares 
shown as evenly aligned with the second row, 
save that the last square, Number 11, has no 
block above it since the second row consists 
of only four squares. The last two questions 
ask, “What is the largest sum possible of 
three squares that can be eliminated by re- 
moving three matches?” and “What is the 
smallest sum of three squares that can be 
eliminated by removing three matches?” 
Once again increased difficulty is obtained 
through varying the complexity of the de- 
signs and task while the difficulty of the sym- 
bols remains constant. 

As can be seen, both types of questions 
start with items so simple that a child could 
solve them. They proceed in difficulty so 
that even sophisticated and intelligent adults 
find the last items challenging. Both sub- 
tests are timed. Fifteen minutes are allowed 
for the first group of questions and 20 minutes 
for the second. Even though these time limits 
are provided, the items constitute a power
test.

Research with the Lowry test has shown
that it measures much the same abilities as
the California Test of Mental Maturity and
is less influenced by social status bias (Lucier
& Burnette, 1957). Similar results were
obtained when the Lowry and Cooperative
School and College Abilities Test were com-
pared with each other and in relation to so-
cial status bias (Lucier & Farley, 1957).

More recently, it has been used with adults
in comparison with the general technical score
(GTS) of the Army Classification Battery. 
This score is taken frequently as being equiva- 
lent to a measure of general intelligence. A 
correlation of .70 was obtained. ACB sub- 
tests which seemed related to the commonly 
accepted measures of intellectual functioning, 
i.e., reading and vocabulary, arithmetic rea- 
soning, and pattern analysis were also ex- 
amined against the Lowry test. Correlations 
of .66, .63, and .55 respectively were secured. 
The Lowry test was also found to correlate
.45 against ratings of performance of observer-recorder
personnel in the Quartermaster Corps,
whereas the GTS achieved a correlation
of .34. All of these correlations, with
the exception of the last, were significant at 
the .01 level. The last reached significance 
at the .05 level. It was concluded that the 
Lowry Reasoning Test Combination was more 
efficient in identifying soldiers capable of ob- 
server-recorder duties than were components 
of the ACB which were usually used. The 
Lowry test was also found to be more status 
free than the GTS (Andrews, Lebo, & Lucier, 
1959).


Summary


The Lowry Reasoning Test Combination
has been found to be relatively free of social
status bias and to measure intellectual func-
tion. It is easily administered and simply
scored and does not depend upon a high level
of verbal ability. Variance in concept diffi-
culty is obtained by altering combinations of
constructs while keeping the verbal material
on a uniformly simple level. Wherever
a discriminative and effective selection device
is needed the present writers would recom-
mend that the Lowry test be tried.


REFERENCES 


Andrews, R. S., Lebo, D., & Lucier, O. A func-
tional approach to the validity of a purported
status-free test of intellectual ability. Paper read
at Southeast. Psychol. Ass., St. Augustine, April
1959.

Lucier, R. O., & Burnette, R. The Lowry Reason-
ing Test Combination with younger adolescents.
Amer. Psychologist, 1957, 12, 373. (Abstract)

Lucier, R. O., & Farley, J. A. A validation of two
tests purported to be free of social status bias.
Paper read at Southeast. Psychol. Ass., Nashville,
March 1957.


(Received March 13, 1959) 




Journal of Applied Psychology 
Vol. 43, No. 6, 1959 


INFLUENCE OF BRAINSTORMING INSTRUCTIONS 
AND PROBLEM SEQUENCE ON A CREATIVE 
PROBLEM SOLVING TEST¹


ARNOLD MEADOW 


University of Arizona 
SIDNEY J. PARNES AND HAYNE REESE
University of Buffalo 


Experimental investigations of problem solv- 
ing have identified several variables which in- 
terfere with performance on problem solving 
tasks, and others which facilitate perform- 
ance. Among the former variables are pre- 
liminary experience with the components of 
a problem in a different context from the 
test problem (Birch & Rabinowitz, 1951), 
failure (Lazarus & Eriksen, 1952), frustra- 
tion (Mohsin, 1954), and rigidity (Luchins, 
1942). Among the latter variables are in- 
structions including the phrase “don’t be 
blind” (Luchins, 1942), praise (Cowen, 
1952), instructions to be “clever” (Christen- 
sen, Guilford, & Wilson, 1957), and train- 
ing in creative problem solving emphasiz- 
ing the brainstorming procedure (Meadow & 
Parnes, 1958). The present paper describes 
the results of an experiment designed to 
evaluate further the effects of the brainstorm- 
ing method on creative problem solving. 

In the brainstorming procedure S segregates 
in time the formation of a solution and the 
judgment of its efficacy or value. The S is 
encouraged to express any possible solution 
which comes to mind during [illegible]
postponing the evaluation of the solution to
a later time (Osborn, 1957). Meadow and
Parnes (1958) found that Ss who had taken 
a one-semester course in creative problem 
solving, which emphasized the brainstorming 
procedure, were significantly superior on five
of seven measures of creative ability to a
group of matched control Ss who had not
taken the course. This difference in perform-
ance cannot be unequivocally attributed to
the difference in training in the brainstorming
procedure, because experimental and control
groups also differed in other variables em-
phasized in the creative problem solving
course, which involved constant practice with
problems requiring creative ability.

1 This research was financed by a grant from the
Creative Education Foundation.

The present experiment was designed as a 
test of the effectiveness of the brainstorming 
procedure, using only Ss who were members 
of the course in creative problem solving in 
order to control for the amount of previous 
experience in the various problem solving 
methods. Each S was given two problems 
which required creative ability, in two test- 
ing periods. One problem was administered 
under brainstorming instructions, which al- 
lowed Ss to formulate possible solutions with- 
out evaluating them; the other problem was 
administered under nonbrainstorming instruc- 
tions, which required Ss to formulate and 
evaluate solutions simultaneously. The qual- 
ity of the solutions was later evaluated by a 
trained rater. It was expected, on the ba- 
sis of previous experimentation (Meadow & 
Parnes, 1958), that more solutions of good 
quality would be produced under the brain- 
storming instructions than under the non- 
brainstorming instructions. 


Method 


Subjects. The Ss were 32 college students from 
two courses in creative problem solving, one given 
at the University of Buffalo and the other at Mc- 
Master University. The experiment was conducted 
during the final two weeks of the semester. The Ss
were randomly divided into four experimental groups, 
each containing eight Ss. 

Experimental problems. Two problems, the Hanger 
problem and the Broom problem, were selected from 
the AC Test of Creative Ability. Since the AC 
Test is reported to differentiate “creative” from 
“noncreative” Ss (Harris & Simberg, 1954), the two 
problems selected presumably require creative abil- 
ity. The Hanger problem was used in a previous
experiment, in which it was found that Ss trained 
in brainstorming performed at a significantly higher 
level on the problem than nontrained Ss (Meadow 
& Parnes, 1958). The problem required Ss to list 
other uses for a hanger or a broom. 




Procedure. The Ss were given first one problem, 
and then the second immediately thereafter. All 
tests were group administered to each of the four 
groups separately. One problem was given under 
brainstorming instructions and the other under non- 
brainstorming instructions. The essentials of the 
brainstorming instructions were as follows: “Brain-
storm to your fullest ability; forget about quality
entirely. We are going to count only quantity on
this test. . . . Quality is of no concern at all.” The
essentials of the nonbrainstorming instructions were: 
“Forget all about brainstorming. Strive completely 
for quality. We want to see how many good ideas 
you can produce in a certain amount of time. You 
are going to be penalized for any bad ideas. Any 
ideas rated as poor will be subtracted from your
score. . . .” The Ss were allowed 5 min. for each
problem. Half of the Ss were given first the Hanger prob-
lem and then the Broom problem; the other half
were given the problems in the reverse order. Simi- 
larly, half of the Ss were given first the brainstorm- 
ing instructions and then the nonbrainstorming in- 
structions, and the other half were given the instruc- 
tions in the reverse order. The design is illustrated 
in Table 1. The three experimental variables, In- 
structions, Problems, and Test Periods (first and 
second), were all within Ss factors, and Lindquist’s 
Type V analysis of variance (Lindquist, 1953) was 
used for statistical analysis. 
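
The counterbalancing described above can be enumerated mechanically: each group receives one instruction-problem pairing first and the complementary pairing second. The sketch below is illustrative only; the names `build_design` and `other` are assumptions introduced here, and the group numbering simply follows the order in which the pairings are generated.

```python
from itertools import product

INSTRUCTIONS = ("B", "NB")            # brainstorming, nonbrainstorming
PROBLEMS = ("Hanger", "Broom")


def other(value, pair):
    """Return the member of a two-item pair that is not `value`."""
    return pair[1] if value == pair[0] else pair[0]


def build_design():
    """List (group, first-period condition, second-period condition)."""
    design = []
    pairings = product(INSTRUCTIONS, PROBLEMS)
    for group, (instr, prob) in enumerate(pairings, start=1):
        first = (instr, prob)
        second = (other(instr, INSTRUCTIONS), other(prob, PROBLEMS))
        design.append((group, first, second))
    return design


design = build_design()
for group, first, second in design:
    print(group, first, second)
```

With eight Ss per group, every instruction and every problem is met by 16 Ss in each test period, which is the cell size used in the analysis of variance.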

Ratings. Each response was copied onto a sepa- 
rate slip of paper, given a code number, and then 
presented to the rater for evaluation; hence the rater 
was never aware of whether he was scoring a re- 
sponse produced under the brainstorming or the 
nonbrainstorming instructions. The rater was in- 
structed to rate each response on 3-point scales of 
(a) uniqueness—the degree to which the response 
departed from the conventional use of the object, 
and (b) value—the degree to which the response
was judged to have social, economic, aesthetic, or
other usefulness. The rater had previously used
these scales in connection with other research
(Meadow & Parnes, 1958). Any response which
duplicated in essential meaning any other response
was eliminated from the scoring. A response was
designated “good” if the combined uniqueness and
value score was 5 or 6. The performance measures
were the number of good responses and the number
of good responses expressed as a percentage of the
total number of responses.

The inter-rater reliability for the Broom problem,
for ratings of 30 Ss selected at random, was .91.
The inter-rater reliability for the Hanger problem
was not determined for the present data, but had
been found to be .74 in a previous experiment in-
volving the same raters (Meadow & Parnes, 1958).²

2 Scores on the Hanger problem in the previous
experiment correlated with other creative ability
measures as follows: Guilford’s Unusual Uses, .473;
Guilford’s Plot Titles High, .452; Guilford’s Apparatus,
.301; TAT Originality, .520. All but the Apparatus cor-
relation were at the .01 level of significance. The
correlation with the Apparatus test was significant
at the .05 level (Meadow & Parnes, 1958).


Table 1

Experimental Design

          First Test Period         Second Test Period
Group   Instructions   Problem    Instructions   Problem
  1         B          Hanger         NB         Broom
  2         B          Broom          NB         Hanger
  3         NB         Hanger         B          Broom
  4         NB         Broom          B          Hanger

Note.—B refers to brainstorming; NB refers to nonbrain-
storming.


Table 2

Mean Numbers of Good Solutions

                         Test Periods
                  First       Second      Both
                  Test        Test        Tests
Instructions:
  B               9.00        6.88        7.94
  NB              3.25        4.50        3.88
Problems:
  Hanger          7.62        5.06        6.34
  Broom           4.62        6.31        5.47

Note.—B refers to brainstorming; NB refers to nonbrain-
storming.


Results


The mean numbers of good solutions pro-
duced under the two instructions, for the first
and second test periods separately, are pre-
sented in Table 2. These data show that
more good solutions were produced under the
brainstorming instructions than under the
nonbrainstorming instructions, and that this
effect was greater in the first test period than
in the second. These effects were statisti-
cally significant, as indicated by the signifi-
cant main effect of Instructions and the sig-
nificant interaction between Instructions and
Test Periods (see Table 3). The simple ef-
fects involved in this interaction were ana-
lyzed by means of the Cochran-Cox approxi-
mate t test (Cochran & Cox, 1950).
Significantly more good solutions were
produced under the brainstorming instruc-
tions when they were given first than when
they followed nonbrainstorming instructions
(t = 2.30, df = 28, .05 > p > .02). There
was no significant difference in performance
under nonbrainstorming instructions in the
first test period as against the second (t =
1.36, df = 28, p > .10).

The mean numbers of good solutions pro- 
duced on the Hanger and Broom problems 
for the first and second test periods are given 
in Table 2. These data suggest that there 
were more good solutions of the Hanger prob- 
lem than the Broom problem in the first test 
period, but that this trend was reversed in 
the second test period. This effect was sta- 
tistically significant, as indicated by the sig- 
nificant Problems by Test Periods interaction 
in Table 3. The Cochran-Cox approximate
t test was used to analyze the simple effects
involved in this interaction. For the Hanger
problem the difference between performance
in the first and second test periods was sta-
tistically significant (t = 2.78, df = 28, p <
.01), but for the Broom problem the differ-
ence between the first and second test periods
was not significant (t = 1.84, df = 28, .10 >
p > .05).

The over-all difference between the first 
and second test periods and the over-all dif- 
ference between the Hanger and Broom prob- 
lems were not significant. 
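
The two interaction F ratios reported in Table 3 can be approximated from the cell means in Table 2. The sketch below assumes 16 scores per cell (32 Ss, half of them in each instruction-by-period or problem-by-period cell) and uses the between-Ss error mean square of 8.91 from Table 3; the function name `interaction_ms` is illustrative. Because the published means are rounded to two decimals, the results are approximate.

```python
N_PER_CELL = 16     # assumed: 32 Ss, half in each cell of a 2 x 2 layout
ERROR_B = 8.91      # between-Ss error mean square (Table 3, df = 28)


def interaction_ms(m11, m12, m21, m22, n=N_PER_CELL):
    """One-df interaction mean square for a 2 x 2 table of cell means,
    each mean based on n scores."""
    contrast = m11 - m12 - m21 + m22
    return n * contrast ** 2 / 4


# Cell means from Table 2, ordered (first test, second test) within each row.
ms_it = interaction_ms(9.00, 6.88, 3.25, 4.50)   # Instructions x Test Periods
ms_pt = interaction_ms(7.62, 5.06, 4.62, 6.31)   # Problems x Test Periods

f_it = ms_it / ERROR_B
f_pt = ms_pt / ERROR_B

for label, ms, f in (("I x T", ms_it, f_it), ("P x T", ms_pt, f_pt)):
    print(f"{label}: MS = {ms:.2f}, F = {f:.2f}")
```

The resulting ratios fall close to the values of 5.11 and 8.11 given for these interactions, which suggests the rounded cell means and the error term are mutually consistent.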


Table 3 


Summary of Analysis of Variance of Absolute
Number of Good Solutions


Source                  df       MS            F             p
Between Ss              31
  I × P                  1    [illegible]      <1            ns
  I × T                  1    [illegible]      5.11          <.05
  P × T                  1    [illegible]      8.11          <.01
  Error (b)             28       8.91
Within Ss               32
  Instructions (I)       1     264.06       [illegible]   [illegible]
  Problems (P)           1      12.25       [illegible]      ns
  Test Periods (T)       1       3.06          <1            ns
  I × P × T              1      15.99          3.48          ns
  Error (w)             28       4.59
Total                   63

Note.—Instructions refer to Brainstorming vs. Nonbrain-
storming; Problems refer to Hanger vs. Broom; Test Periods
refer to the first test administration vs. the second adminis-
tration.


Table 4

Summary of Analysis of Variance of
First Test Period Only

Source                                df       MS         F         p
Brainstorming—Nonbrainstorming         1     235.00     31.52     <.001
Broom—Hanger                           1      72.00      8.58     <.01
Interaction                            1       8.00     <1.00      ns
w-cells                               28       8.39
Total                                 31


Since both Instructions and Problems inter- 
acted with Test Periods, the data for the first 
test period only were analyzed separately, by 
means of a factorial analysis of variance 
(Lindquist, 1953), summarized in Table 4. 
The main effects of Instructions and of Prob-
lems were both statistically significant, indi-
cating that more good solutions were pro-
duced in the first test period under the
brainstorming instructions than under the
nonbrainstorming instructions, and that there
were more good solutions of the Hanger prob-
lem than the Broom problem.


Discussion 


The present experiment tests the assump- 
tion that the brainstorming method leads to 
an increase in creativity on certain creative 
thinking tasks. The findings support this as- 
sumption to the extent that our measure of 
creativity is a valid one. The E's concept of 
creativity, defined by the rating scale, has 
face validity. More importantly, it has con- 
current validity, in that it correlates signifi- 
cantly (though not highly) with other meas- 
ures of creative ability. 

Although the principal training method uti- 
lized in the previous experiment (Meadow 
& Parnes, 1958) was “brainstorming,” other 
supplementary training methods described by 
Osborn (1957) were also employed. The re- 
sults of the present experiment accordingly 
provide a more decisive test of the efficacy of 
the brainstorming method per se. 

One point should be emphasized with re- 
spect to the interpretation of both experi- 
ments. Trained Ss were utilized in each 
study. An experiment designed to evaluate
the effects of brainstorming with untrained
Ss is now in progress.


Summary 


The experiment was designed to study the 
effects on creative problem solving of instruc- 
tions to express solutions without evaluation 
(brainstorming) and instructions which re- 
quired only solutions of good quality and 
which involved a penalty for solutions of bad 
quality (nonbrainstorming). Each S was
given two problems which required creative 
ability, in two testing periods. One problem 
was administered under brainstorming in- 
structions; the other problem was adminis- 
tered under nonbrainstorming instructions. 
The quality of the solutions was later evalu- 
ated by a trained rater. 

The major findings of the experiment were 
the following. (a) Significantly more good 
solutions were produced under the brain- 
storming instructions than under the non- 
brainstorming instructions. (b) Significantly 
more good solutions were produced under the 
brainstorming instructions when they were 
given first than when they followed nonbrain- 
storming instructions. There was no signifi- 
cant difference in the nonbrainstorming per- 
formance in the two test periods. 




REFERENCES 


Birch, H. G., & Rabinowitz, H. S. The negative
effect of previous experience on productive think-
ing. J. exp. Psychol., 1951, 41, 121-125.

Christensen, P. R., Guilford, J. P., & Wilson,
R. C. Relations of creative responses to working
time and instructions. J. exp. Psychol., 1957, 53,
82-88.

Cochran, W. G., & Cox, Gertrude M. Experi-
mental designs. New York: Wiley, 1950.

Cowen, E. L. Stress reduction and problem-solv-
ing rigidity. J. consult. Psychol., 1952, 16, 425-
428.

Harris, R. R., & Simberg, A. L. AC test of crea-
tive ability. (Examiner’s manual.) AC Spark
Plug Division, General Motors Corp., 1954.

Lazarus, R. S., & Eriksen, C. W. Psychological
stress and its personality correlates: Part I. The
effects of failure stress upon skilled performance.
J. exp. Psychol., 1952, 43, 100-105.

Lindquist, E. F. Design and analysis of experi-
ments in psychology and education. Boston:
Houghton Mifflin, 1953.

Luchins, A. S. Mechanization in problem solving:
The effect of Einstellung. Psychol. Monogr., 1942,
54 (6), (Whole No. 248).

Meadow, A., & Parnes, S. J. Evaluation of training
in creative problem solving. J. appl. Psychol.,
1959, 43, 189-194.

Mohsin, S. M. Effect of frustration on problem-
solving behavior. J. abnorm. soc. Psychol., 1954,
49, 152-155.

Osborn, A. F. Applied imagination. New York:
Scribner’s, 1957.


(Received March 16, 1959) 




Journal of Applied Psychology 
Vol. 43, No. 6, 1959 


OUTPUT RATES AMONG MACHINE OPERATORS: II. 
CONSISTENCY RELATED TO METHODS OF PAY 


HAROLD F. ROTHE AND CHARLES T. NYE


Fairbanks, Morse and Company, Beloit, Wisconsin 


Some time ago a study was made of the 
consistency of output of a group of machine 
operators (Rothe, 1947). That study re- 
vealed a relative lack of consistency of pro- 
duction from one two-week period to the next. 
The operators were paid on a straight hourly 
rate, and not by any type of incentive system. 

Other studies of the output of various 
groups of industrial operators showed, in gen- 
eral, rather low consistency of performance 
when a day rate pay system was in effect, 
and a high consistency when an incentive pay 
system was used (Rothe: 1946, 1951; Rothe 
& Nye, 1958). These studies also indicated 
another difference that seemed to vary ac- 
cording to the method of payment in effect; 
namely, under an incentive system the aver- 
age output of a group of operators over a pe- 
riod of time showed greater variability than 
did the output of a single operator observed 
over a period of time; and, conversely, with- 
out an incentive system, the output of a sin- 
gle operator observed over a period of time 
showed greater variability than did the aver- 
age output of a group of operators over a 
period of time.

These observations led to the formulation 
of two hypotheses relating output consistency 
to the adequacy of the incentives in operation. 
The hypotheses are stated here as follows: 
(a) “the incentives to work may be consid- 
ered ineffective when the ratio of the range 
of intra-individual differences is greater than 
the ratio of the range of interindividual dif- 
ferences” (Rothe, 1946) and (b) if the inter- 
correlation of output rates for two periods 
closely related in time is less than .70, the 
incentivation is not highly effective, while in- 
tercorrelation higher than .80 indicates effec- 
tive incentivation (Rothe & Nye, 1958). 
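
The two hypotheses amount to simple decision rules, which can be written out as follows. This is a sketch only: the function names are introduced here for illustration, and the label "indeterminate" for correlations between .70 and .80 is an assumption, since the hypotheses as stated say nothing about that band.

```python
def correlation_verdict(r: float) -> str:
    """Hypothesis (b): classify incentive effectiveness from the
    intercorrelation of output rates for two adjacent periods."""
    if r > 0.80:
        return "effective"
    if r < 0.70:
        return "not highly effective"
    return "indeterminate"   # assumed label; the hypothesis is silent here


def incentives_ineffective(intra_ratio: float, inter_ratio: float) -> bool:
    """Hypothesis (a): incentives are considered ineffective when the
    intra-individual ratio exceeds the interindividual ratio."""
    return intra_ratio > inter_ratio


print(correlation_verdict(0.65), correlation_verdict(0.85))
print(incentives_ineffective(intra_ratio=1.5, inter_ratio=1.2))
```

The numbers passed in the last two lines are arbitrary illustrations, not data from any of the plants studied.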

These hypotheses have never been tested 
experimentally. It is quite likely that they 
never will be, since an experiment seems al- 
most doomed to remove some of the aspects 


of a normal industrial situation. The best 
check of the validity of these hypotheses ap- 
parently is to make study after study of 
workers at their work places to see if these 
observations continue to hold true. 

The purpose of this paper is to report an- 
other study of industrial machine operators, 
working under different conditions of financial 
incentivation. In the present instance it is 
necessary to keep anonymous the name of the 
plant involved. It shall be called Plant B. 
Where reference is made to the earlier study 
of machine operators, the name Plant A will 
be used in this paper. 


Background of the Study 


Plant A, previously reported, involved 130 men 
over a six-week period. The men were paid an 
hourly rate (nonincentive) but, since standards ex- 
isted for each job, it was possible to compare the 
output of men on various types of machines and 
operations. The plant was located in a Wisconsin 
city of between 5,000 and 6,000 persons, and Plant A
was by far the largest industry in the city. 

Plant B is also located in Wisconsin, again in a 
city of about 6,000 persons. It is one of several 
manufacturing plants in town, no one of which is 
relatively as large as was Plant A in its town. Plant 
B has several hundred employees in its machine 
shops. Their work is generally similar to the work 
of the employees of Plant A. One outstanding dif- 
ference is that the employees in Plant B are paid 
according to an incentive system. Data for 42 men 
over a period of 10 weeks in 1958 were taken from 
the factory records. 

The most readily observable differences between 
the two plants were: (a) Although both plants were 
located in small cities, Plant A was by far the largest 
establishment in its city, while Plant B was one of 
several approximately equal sized plants in its city; 
(b) Plant A was mainly locally owned, while Plant 
B is a branch plant of a company whose main offices 
are elsewhere; (c) Plant A paid by an hourly rate 
while Plant B paid according to a financial incen-
tive system. An interesting observation should be made.
At the time of the study in Plant A (1946) there
were acute shortages of materials for most plants,
including Plant A. At the time of this later study
there was a period of economic adjustment, or re-
cession, and business was generally not good. Plant
B felt the effects of this condition and was laying
off employees at the time this study was made.
Thus it will be noted in the tables in this paper
that Plant B does not show 42 men for each week.
Some men were being laid off, and some were re-
called, during the period of the study.


Table 1

Weekly Average Output (Percentage Performance of
Standard) for Group of Machine Operators

                   Percentage      Number of
Week Ending        Performance     Employees
February 24          122.6            42
March     3          120.9            39
         10          127.4            33
         17          125.7            32
         24          127.9            29
         31          122.8            29
April     7          124.9            34
         14          126.8            28
         21          121.9            33
         28          122.9            26


Data 


The weekly average output for the group,
and the number of employees whose output
was used in this study, are shown in Table 1.
Inspection of this table shows the percentage
of performance to standard was quite stable
but the number of employees generally was
smaller in each successive week. The per-
formance was at a high level, since 100%
was standard. This means that the group of
employees averaged about 25% pay above
their base rates. It indicates that the incen-
tive system was effective since it apparently
“incentivated” the employees to produce more
than what management considered to be
standard production.


Table 2

Frequency Distributions of Weekly Average Output
of All Operators for Two Selected Weeks

Columns: Percentage Performance to Standard; No. of
Employees, “Most” Week; No. of Employees, “Least” Week.
Class values: 155, 145, 135, 125, 115, 105, 95, 85, 75, 65,
Below 64. [Employee counts illegible in source.]

Note.—* = median.

The distribution of output of all operators 
for any one week was somewhat skewed al- 
though they approximated a normal distribu- 
tion. (The output of the various other groups 
of industrial workers reported in previous 
studies was normally distributed.) To show 
the shape of the distribution in this study, 
the outputs for the week when there were the 
most employees and the week when there 
were the least employees were selected. The 


Table 3 


Frequency Distribution of r’s between Successive 
Week’s Output. Individual Performance 
for Group of Machine Operators 


r Frequency 
a EE et i 2 
-91-1.00 1 
81- .90 2 
-71- 80 3 
-61— .70 3 
51- .60 1 


ee ie a 
Note.—Median r = .78, 


distributions of output for these two weeks 
are shown in Table 2. It is interesting t° 
note that the medians for both weeks were 
the same, although in one week there were 4 
employees and in the other week only 26. 
The correlation of each employee’s pel” 
tmance for one week with his performance 
for the following week was determined by thé 
Pearsonian 7 even though the distributions 
were slightly skewed. This was done in ov 
der to facilitate comparisons with the earlier 
studies. The distribution of these r’s is show” 
in Table 3 where the median 7 is .78. 
range of r’s was much smaller than for the 
group of coil winders, previously reporte 
(Rothe & Nye, 1958), and was approximately 
aas as for the chocolate dippers (RotP® 
This 7 of .78 does not quite meet the ” a 
80 that has been hypothesized to indicat? 


fo 


it 


re 


Oe: eee —s 


Output Rates among Machine Operators 


419 


Table 4 


Highest and Lowest Average Weekly Performances, and Their Ratios for Individual Machine Operators 
during 10-Week Period 


Highest Lowest Ratio of Highest Lowest Ratio of 
Weekly Weekly Highest Weekly Weekly Highest 
Employee Average Average to Lowest Employee Average Average to Lowest 
A 130.3 87.4 1.17 S 135.8 76.1 1.78 
B 139.6 101.2 1.38 T 109.4 53.1 2.06 
C 138.8 100.0 1.38 G 133.7 119.0 1.12 
D 140.3 105.7 1.33 Vv 138.1 95.9 144 
E 145.5 127.3 1.14 wW 140.1 135.0 1.04 
F 146.9 121.8 1.21 X 14.9 113.8 1.27 
G 144.6 116.6 1.24 x 147.2 105.3 1.40 
H 154.8 141.8 1.09 Zz 143.1 139.1 1.03 
I 146.3 114.5 1.28 AA 139.2 110.0 1.27 
J 146.3 123.0 1.19 BB 66.9 39.3 1.70 
K 148.0 140.0 1.06 Ce 113.7 47.3 2.40 
L 156.0 102.0 1.53 DD 139.5 95.9 1.45 
M 142.7 129.2 1.11 EE 147.8 142.4 1.04 
N 141.5 85.1 1.66 FF 129.6 70.7 1.83 
10) 147.3 107.3 1.37 GG 139.9 107.7 1.30 
P 142.2 137.5 1.03 HH 143.5 99.5 1.44 
Q 145.0 103.0 1.41 II 137.8 112.0 1.23 
R 118.3 67.0 1.77 JJ 148.5 146.2 1.02 


Note.— Median intra-individual ratio = 1.29. 


effective incentivation, but is obviously so 
close that the difference is statistically insig- 
nificant. 
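
The week-to-week correlation procedure can be illustrated with a minimal sketch. The output figures below are hypothetical and are not the Plant B records; the function names are likewise illustrative. Each pair of consecutive weeks yields one Pearsonian r across employees, and the median of those r's is compared with the hypothesized criterion of .80.

```python
from math import sqrt
from statistics import median


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def week_to_week_rs(rows):
    """One r per pair of consecutive weeks; rows are employees, columns weeks."""
    weeks = list(zip(*rows))
    return [pearson_r(weeks[i], weeks[i + 1]) for i in range(len(weeks) - 1)]


# HYPOTHETICAL weekly output (percent of standard) for five employees
# over four weeks. These values are invented for illustration only.
output = [
    [130.0, 128.5, 131.0, 127.0],
    [145.5, 147.0, 144.0, 146.5],
    [110.0, 118.0, 105.5, 112.0],
    [152.0, 149.5, 151.0, 154.0],
    [ 98.0,  92.5, 101.0,  95.0],
]

rs = week_to_week_rs(output)
median_r = median(rs)
meets_criterion = median_r >= 0.80  # hypothesized criterion from the text
```

In the study itself the number of employees varied from week to week, so each r would have to be computed only over the employees present in both weeks of a pair.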

The output for the most productive and 
least productive weeks for each employee for 
any one of the 10 weeks, and the ratio of 
highest to lowest performance, are shown in 
Table 4. The average (median) ratio of 
these intra-individual differences is 1.29. 

The ratio of best operator to worst opera- 
tor for each week is shown in Table 5 where 
the average (median) ratio of interindividual 
differences is 2.89. Thus the second hypo- 
thetical requirement for effective incentivation 
has been met in this situation, since the ratio 
for interindividual differences exceeds the 
ratio for intra-individual differences. 
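
The comparison of the two median ratios can be checked directly from the tabled values. The sketch below uses the ratio columns of Tables 4 and 5 as transcribed here; the variable names are illustrative only.

```python
from statistics import median

# Ratio of highest to lowest weekly average, one value per employee (Table 4).
intra_ratios = [
    1.17, 1.38, 1.38, 1.33, 1.14, 1.21, 1.24, 1.09, 1.28,
    1.19, 1.06, 1.53, 1.11, 1.66, 1.37, 1.03, 1.41, 1.77,
    1.78, 2.06, 1.12, 1.44, 1.04, 1.27, 1.40, 1.03, 1.27,
    1.70, 2.40, 1.45, 1.04, 1.83, 1.30, 1.44, 1.23, 1.02,
]

# Ratio of best operator to worst operator, one value per week (Table 5).
inter_ratios = [3.96, 3.77, 2.52, 2.28, 2.84, 3.12, 2.37, 2.94, 3.49, 2.82]

median_intra = round(median(intra_ratios), 2)
median_inter = round(median(inter_ratios), 2)

# Second hypothesized requirement: differences between operators should
# exceed the week-to-week differences within operators.
requirement_met = median_inter > median_intra
```

With these values the medians agree with the table notes (1.29 and 2.89), and the interindividual ratio is a little more than twice the intra-individual ratio.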


Discussion 


Two hypotheses were presented in earlier 
papers, and data from another factory, Plant 
B, were related to these hypotheses. In gen- 
eral, the results obtained in Plant B support 
the two hypotheses. This is interpreted as 
strengthening, but not proving, the hy- 
potheses. 


The week-to-week intercorrelation of out- 
put rates meets (for all practical purposes) 
the criterion of .80 which supposedly indi- 
cates that the incentives to work are effective. 
The average weekly output was about 125% 


Table 5 


Highest and Lowest Average Individual Weekly Per- 
formances and Their Ratios for Group of 
Machine Operators during 
10-Week Period 


                Highest      Lowest       Ratio of
                Employee’s   Employee’s   Highest
Week Ending     Average      Average      to Lowest
February 24     148.8        39.6         3.96
March     3     148.1        39.3         3.77
         10     148.0        58.6         2.52
         17     147.9        64.8         2.28
         24     151.1        53.1         2.84
         31     147.7        47.3         3.12
April     7     147.6        61.1         2.37
         14     147.9        50.4         2.94
         21     148.5        42.5         3.49
         28     156.0        55.2         2.82


Note—Median interindividual ratio = 2.89. 




of standard and this suggests that the incen- 
tives did indeed incentivate. 

The ratio of interindividual differences ex- 
ceeded the ratio of intra-individual differ- 
ences, and this also has been hypothesized as 
existing where the incentives are effective. 

In this situation again, as in the previous 
studies, the week-to-week intercorrelation of 
output rates is low when viewed from the 
standpoint of using production data as a cri- 
terion for some other variable. Psychologists 
would not be impressed greatly by a test that 
had a test-retest reliability of .80, but the cri- 
terion against which they often validate their 


Harold F. Rothe and Charles T. Nye 


tests is rarely this high. This results in the 
unique situation whereby the tests that are 
used are more stable measures than the vari- 
able that the tests are intended to predict. 


REFERENCES 


Rothe, H. F. Output rates among butter wrappers:
II. J. appl. Psychol., 1946, 30, 320-327.

Rothe, H. F. Output rates among machine opera-
tors: I. J. appl. Psychol., 1947, 31, 484-489.

Rothe, H. F. Output rates among chocolate dip-
pers. J. appl. Psychol., 1951, 35, 94-97.

Rothe, H. F., & Nye, C. T. Output rates among
coil winders. J. appl. Psychol., 1958, 42, 182-186.


(Received April 2, 1959) 


Journal of Applied Psychology
Vol. 43, No. 6, 1959


A FURTHER INVESTIGATION OF THE PRETEST-TREATMENT
INTERACTION EFFECT ¹


ROBERT E. LANA


American University


It has previously been demonstrated (Lana, 1959) that a
pretest-treatment interaction effect does not occur in the
pretest-treatment-posttest attitude change research design when the
attitude involved is of little concern to Ss (attitude of
nonagricultural college students toward vivisection at a time when
there were no newspaper campaigns on the matter and before the
announcement by the Russians that they had launched a dog in a
rocket). The purpose of this study is to examine this basic attitude
change research design for a pretest-treatment interaction effect
when the attitude in question is one of somewhat greater concern to
Ss than the topic of vivisection. It was inferred that attitude
toward ethnic groups would fulfill this requirement on the basis
that school integration was a salient topic of discussion at the
time in the newspapers and in talks given on campus. The Ss were
largely inhabitants of southern border states, and the city in which
they attended college had integrated its public schools just a few
years previously. Consequently, attitude toward vivisection on the
one hand and attitude toward ethnic groups on the other represent
some distance on a continuum of involvement for Ss used in the study
and may differentially affect the relationship between an
attitudinal pretest and a persuasive treatment of some kind. A
discussion of the importance of this effect for attitude change
methodology is included in the study mentioned above (Lana, 1959).


Method 


Two hundred and twenty-four students in four introductory classes at
The American University served as Ss in the experiment. These groups
were randomly assigned to four treatment conditions presented in
Table 1. Two of these groups received a modified form of the
California Ethnocentrism Scale consisting of 20 Likert-type items as
the pretest attitude questionnaire. A high score represented high
ethnocentrism. One of these two groups viewed the mental health film
on ethnic prejudice, “High Wall” (United States Department of
Health, Education, & Welfare, 1952), 12 days after taking the
pretest. After treatment, this group (Group I) was immediately
posttested with the same questionnaire that served as the pretest.
The other group (Group IV) was simply posttested 12 days later.
Group II viewed the film and was posttested immediately afterward
without having been pretested. Group III answered the questionnaire
once. Two other groups, which were not included in the experimental
design and which had a total N of 100, were simply pretested in
order to examine the comparability of Ss’ initial attitudes on
ethnocentrism. For an examination of the interaction hypothesis a
factorial analysis of variance was used with .05 as the acceptable
level of significance.


¹ Taken from a paper read at the 1959 Eastern Psychological
Association convention in Atlantic City, N. J.


Results 


The four groups of pretest scores, including 
Groups I and IV of the experimental design 
and the two groups not part of this design, 
were submitted to a Bartlett’s Test and found 
to be homogeneous with respect to variance. 
A simple analysis of variance was then per- 
formed on the four pretest means. The re- 
sulting F ratio was not significant at the .05 
level. The Ss were judged to have the same 
initial ethnocentric attitudes in each of the 
groups on the strength of this evidence. 

The variances of the two sets of pretest 
scores of the experimental groups receiving a 
pretest were examined with Bartlett's Test 
and found to be homogeneous. A t test between
the means of these two sets of pretest


Table 1 


Experimental Design 


                              Groups
              I            II           III          IV
Conditions    Pretest                                Pretest
              12 days                                12 days
              Treatment    Treatment
              Posttest     Posttest     Posttest     Posttest




Table 2

Summary of Means and Standard Deviations of Posttest Scores


Groups                                     M        SD      N
I   (Pretest and communication)            46.04    15.6    52
II  (No pretest and communication)         45.70    13.5    57
III (No pretest and no communication)      51.74    13.9    58
IV  (Pretest and no communication)         55.65    14.4    57


scores was not significant at the .05 level. 
The conclusion is drawn that the groups re- 
ceiving the pretest in the experiment were 
initially homogeneous with respect to attitude 
toward ethnic groups.

Means and standard deviations of the posttest scores appear in
Table 2. A Bartlett's Test was then applied to the four posttest
results [...] not significant. [...] variance was [...]ns for the
[...] analysis of variance results appears in Table 3. The F [...]
significant at the .05 level, which implies that the film [...]
about eth[...] [...]tive importance to them.
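
The homogeneity check on the posttest variances can be approximated from the summary statistics in Table 2. The sketch below is an illustration rather than the author's computation: it applies the standard Bartlett statistic to the tabled standard deviations, takes the N for Group II as 57 (an assumption, chosen because the four Ns then total the 224 Ss reported in the Method section), and compares the result with the tabled chi-square value for 3 degrees of freedom at the .05 level.

```python
from math import log


def bartlett_statistic(sds, ns):
    """Bartlett's chi-square statistic computed from group SDs and sizes."""
    k = len(ns)
    dfs = [n - 1 for n in ns]
    variances = [sd ** 2 for sd in sds]
    df_total = sum(dfs)
    # Pooled within-group variance, weighted by degrees of freedom.
    pooled = sum(df * v for df, v in zip(dfs, variances)) / df_total
    numerator = df_total * log(pooled) - sum(
        df * log(v) for df, v in zip(dfs, variances)
    )
    correction = 1 + (sum(1 / df for df in dfs) - 1 / df_total) / (3 * (k - 1))
    return numerator / correction


# Posttest SDs and Ns for Groups I-IV (Table 2).
posttest_sds = [15.6, 13.5, 13.9, 14.4]
posttest_ns = [52, 57, 58, 57]  # ASSUMPTION: Group II N taken as 57

statistic = bartlett_statistic(posttest_sds, posttest_ns)

CHI2_CRITICAL_05_DF3 = 7.815  # tabled chi-square, 3 df, .05 level
variances_homogeneous = statistic < CHI2_CRITICAL_05_DF3
```

Under these assumptions the statistic falls well below the critical value, which agrees with the report that the test was not significant.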




Table 3

Analysis of Variance on Posttest Scores

Source                  df      SS       MS       F        P

Treatment                1     61.23    61.23    16.69    <.05
Pretest                  1      4.52     4.52     1.20    >.05
T X P (Interaction)      1      3.18     3.18    <1       >.05
Error*                 221               3.68

* Error term was computed by the Walker and Lev simple
approximation method for unequal Ns.
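
The entries of Table 3 can be approximately recovered from the summary statistics in Table 2. The sketch below assumes that the approximation for unequal Ns amounts to an unweighted-means analysis: each cell mean is treated as a single observation, and the error term is the pooled within-cell variance divided by the harmonic mean of the cell Ns. That reading is an assumption, as is the N of 57 for Group II (the value implied by the total of 224 Ss); the function and variable names are illustrative.

```python
# Posttest mean, SD, and N for each cell, keyed by (pretested, treated).
cells = {
    (True, True): (46.04, 15.6, 52),    # Group I
    (False, True): (45.70, 13.5, 57),   # Group II; ASSUMPTION: N taken as 57
    (False, False): (51.74, 13.9, 58),  # Group III
    (True, False): (55.65, 14.4, 57),   # Group IV
}


def unweighted_means_anova(cells):
    """2 x 2 unweighted-means analysis from cell means, SDs, and Ns."""
    m = {key: mean for key, (mean, _sd, _n) in cells.items()}
    contrasts = {
        "treatment": (m[True, True] + m[False, True])
        - (m[True, False] + m[False, False]),
        "pretest": (m[True, True] + m[True, False])
        - (m[False, True] + m[False, False]),
        "interaction": m[True, True] - m[False, True]
        - m[True, False] + m[False, False],
    }
    # Each 1-df contrast over four cell means: SS = contrast ** 2 / 4.
    ss = {name: value ** 2 / 4 for name, value in contrasts.items()}

    df_error = sum(n - 1 for _mean, _sd, n in cells.values())
    pooled_var = sum((n - 1) * sd ** 2 for _mean, sd, n in cells.values()) / df_error
    harmonic_n = len(cells) / sum(1 / n for _mean, _sd, n in cells.values())
    ms_error = pooled_var / harmonic_n

    f_ratios = {name: value / ms_error for name, value in ss.items()}
    return ss, ms_error, f_ratios


ss, ms_error, f_ratios = unweighted_means_anova(cells)
```

Under these assumptions the sums of squares and the error mean square agree with the tabled values to within rounding. The F ratios computed this way come out close to, though not identical with, the printed ones, and the interaction F remains below 1.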

Discussion 


Since the F ratio representing the interac- 
tion effect of pretesting and treatment was 
insignificant, it can be concluded that the act 
of pretesting a group of Ss with a question- 
naire does not influence their subsequent re- 
actions to a persuasive appeal in terms of atti- 
tude change toward a topic of some impor- 
tance to them. Apparently, an attitudinal 
pretest has no effect on the reception of a 
succeeding persuasive communication within 
the limits of involvement of S with the topical
continuum represented by vivisection (Lana, 1959)
at one point and ethnic prejudice at another.
Perhaps the interaction effect between pretest
and treatment occurs only when the communication
presents information to S such that a learning
situation is involved rather than a change of
attitude. This contention remains to be
experimentally determined.


REFERENCES 

Lana, R. E. Pretest-treatment interaction effects in
attitudinal studies. Psychol. Bull., 1959, 56, 293-300.

United States Department of Health, Education,
& Welfare, Public Health Service. Mental health
motion pictures: A selective guide. Washington,
1952.


(Received April 9, 1959)


